Het bericht Masterdata Management verscheen eerst op Cusbi.
]]>Masterdata Management is the no.1 trend in today’s Business Intelligence industry surveys. This trend follows from the increasing decentralization of data with the rise of micro-services, unstructured data, external data, and the increased popularity of self-service analytical applications.
These developments may negatively impact data quality, resulting in a degradation of trust in the data. In order to become (and stay) a data-driven organization data quality has to be ensured.
To solve this issue organizations implement single trusted applications that are the master of a data entity, for example customer data. This is often referred to as Masterdata Management (MDM) and a way to enable a ‘Single version of the truth’.
The cusbi data analytics platform is designed around the use of master data, because it makes it much easier and effective to create great business insights. Cusbi can also help organizations to use master data, by advising and providing support in this transformation, and facilitate in creating master data management applications.

With cusbi you control the data in the machine.
Before getting into Masterdata Management, a quick look back how this became such an important trend in the Business Intelligence industry.
To keep competitive in today’s data-driven world people need answers fast for their business challenges. This becomes harder in today’s connected world where data is becoming more decentralized. A data entity such as customer information might be sourced from a multitude of systems. This can include unstructured and external data sources such as social media channels, suppliers and distributors.
IT departments have a hard time to meet these changing demands and environments and at times are left somewhat behind due to the pace of change.
One of the solutions to solve this issue is the rise of self-service data analytics and visualization tools. Organizations are putting these user-friendly applications directly in the hands of end-users and give them the power to solve their business problems. Surveys indicate that currently around 55% of organizations report adopting self-service BI tools (1).

Self service analystics and visualisation tools have empowered end-users (Photo by Burak K from Pexels)
At cusbi we encourage and support this democratization of data, allowing organizations to more effectively reach their business goals. But we also recognize some of the new challenges this brings..
With the larger variety in data sources, in combination with the rise of self-service tools, the decentralization of data results in several new challenges including:
The overall effect is that data quality is degrading. The democratization and decentralization thus results in less trust in the information by the users. This has led to poor information reliability being reported as the biggest obstacle for businesses to monetize data (with data silo-ing also coming in the top 5). (2)
The concerns about data quality have changed the subjects currently on top of mind in our industry. The self-service and data visualization tools that were for long a hot topic for BI platforms are now seen as a commodity (3).
The biggest challenge now is how to get a ‘Single version of the truth’ in organizations.
Implementing Masterdata Management (MDM) solutions is one of the best ways to improve your data.
Classic examples of master data are customer or product data. The goal of Masterdata Management is to create a single data master that is then shared across the organization. A customer master for example ensures that the definitions and the quality controls around that entity are all the same and will behave the same wherever it is used.
The importance of Masterdata Management in solving the data quality challenges is reflected in the industry surveys. Masterdata and Data Quality Management are now the no. 1 trend for the few last years. (4)

Extract from infographic: “Data, BI & Analytics Trend Monitor 2021” (http://barc-research.com/data-bi-analytics-trends-infographic-2021/).
The goal of Masterdata Management is to create a single master, consisting of clear and consistent data entity definitions. Key elements to achieve this are removing duplicate data and combining multiple (incomplete) data records to create one clear data entity. This applies to both existing data in an organization and new data being generated.
The reference master data needs to integrate this with multiple systems in the organization. This is often the one of the bigger challenges, especially when dealing with legacy systems or systems that are a black box to the organization.
But Masterdata Management is more than only technology and architecture, it requires a more all-encompassing approach. Successful Masterdata Management goes hand-in-hand with establishing good Data Governance in organizations, such as ownership and policies for the various data sources. Furthermore, data quality should be assured by implementing best practices such as the Data Quality Cycle.
The cusbi team and cusbi platform can help you to create an excellent Masterdata Management solution. Some of the elements provided in our solution:
One of our clients uses the cusbi platform to perform audits for dozens of supermarkets which are themselves part of various supermarket formulas. The accounting packages from each supermarket are unified to the same data entity definitions. From this unified accounting scheme, the audit reports can be produced which are customized for each supermarket. The users of the accountancy firm have direct control of master data to map the various data sources and reports for each supermarket with the unified data entities.
The all-encompassing approach of adopting Masterdata Management in an organization can be intimidating and causes many to hold off. It impacts the technology, governance and processes in an organization. Masterdata Management is difficult, but the rewards are great. The savings in resources, the quality of your information and simplification of your administration by being able to manage data from a single point are tangible.
Our experience has shown that successful Masterdata Management adoptions are the result of “just getting started”, even if your datasets are relatively small.
We at cusbi are ready to help you in your adoption. Reach out to us at [email protected]l and let’s get you started!
(1) Self-Service BI: An Overview (https://bi-survey.com/self-service-bi)
(2) Data trust pulse survey results 2019 – PWC.
(3) Magic Quadrant for Analytics and Business Intelligence Platforms – February 2020, Gartner.
(4) Data, BI and Analytics Trend Monitor 2021 – November 2020, BARC
(5) OpenAPI Initiative (www.openapis.org)
Het bericht Masterdata Management verscheen eerst op Cusbi.
]]>Het bericht An example of speeding up the rendering of your Django templates verscheen eerst op Cusbi.
]]>def nr_failures(self): last_year = make_aware(datetime.today() - timedelta(days=365), get_current_timezone()) return FailureMaintenance.objects.filter(apparaat=self, type__failure_maintenance='F', start_dt__gte=last_year, ).count() def nr_maintenance(self): last_year = make_aware(datetime.today() - timedelta(days=365), get_current_timezone()) return FailureMaintenance.objects.filter(apparaat=self, type__failure_maintenance='M', start_dt__gte=last_year, ).count()
Views.py
eq = PieceOfEquipment.objects.filter(client=client) return render(request, 'meldman/overview_equipm.html', {'eq': eq})
HTML template
<td style="text-align:left”> <a href="proxy.php?url={% url 'some_url' %}?device={{ eq.id }}">{{ eq.nr_failures }} <td style="text-align:left”> <a href="proxy.php?url={% url 'some_url' %}?device={{ eq.id }}">{{ eq.nr_maintenance }}
Beautiful, relative simple code with the logic stored in the model (as most blogs tell you to do). Well done, pat on the shoulder.
And this worked fine, however, somehow clients seemed to register more and more equipment which they tracked in the “Meldman” (which I guess can be seen is good news, seemingly they like the application…). The result was that the peformance began te degrade to a level which was not acceptable anymore. So, what was happening? Using the Django-debug-toolbar it was quickly evident that one (expensive) query was used over-and-over again. If a client has 300 pieces of equipment a similar query would run 300 time for calculating the “nr_failures” and 300 times for calculating the “nr_maintenance”.
So, what did we change:
1) we stopped using the properties on the model
2) we changed the view to:
from django.db.models import Count, Q last_year = make_aware(datetime.today() - timedelta(days=365), get_current_timezone()) eq = Equipment.objects.filter(Q(klant=klant) & Q(FailureMaintenance__start_dt__gte=last_year)) eq = eq.annotate(nr_failures=Count('failuremaintenance', filter=Q(failuremaintenance__type__failure_maintenance='F'))) eq = eq.annotate(nr_maintenance=Count('failuremaintenance', filter=Q(failuremaintenance__type__failure_maintenance='M'))) return render(request, 'meldman/overview_equipm.html', {'eq': eq})
Now, no matter how many pieces of equipment a client has only one query (yes, “1”, as in “the first positive integer”) is executed. Obviously this solved the performance problem quite handsomely.
All in all a fairly painless exercise with a huge benefit.
Het bericht An example of speeding up the rendering of your Django templates verscheen eerst op Cusbi.
]]>Het bericht History of objects verscheen eerst op Cusbi.
]]>In these systems people (both customers and employees) enter data. They create (C) new records, they inspect records (read, R) and they update (U) records. They sometimes delete (D) records, more on that later. The astute reader has recognized the infamous “CRUD” acronym. In general we are not interested in the “R” part: every screen shown to customers and employees is littered with the data, keeping a record who has seen what field, when, and what the value of the field was at the time would create a deluge of history which is (as far as I can see) useless.
So what is it exactly we want? For each record ever produced in the database (in Django terms: for each object of a model ever created, updated or modified) we want to know for each transaction:
Let’s dig into an example and iterate through a number of solutions.We define 4 employees as a choice list and we have an oversimplified payment table (e.g. missing in this table is to whom we make the payment). The models.py looks like:
from django.db import models class TransactionDT(models.Model): """ To be added to (nearly) every model. Last_modified_user to be filled in the admin and in the views.""" modified_dt = models.DateTimeField(auto_now=True, null=True) # filled by Django last_modified_user = models.ForeignKey('auth.User', verbose_name='Was changes by', null=True, blank=True) class Meta: abstract = True class Payment(TransactionDT): EMPLOYEE_CHOICES = (('A', 'Ada'), ('B', 'Ben'), ('C', 'Carl'), ('D', 'Daisy')) employee = models.CharField(max_length=1, choices=EMPLOYEE_CHOICES, null=False) amount = models.DecimalField(max_digits=12, decimal_places=2, null=False, default=0) payment_dt = models.DateTimeField(help_text='when the payment will be executed', null=False) class Meta: unique_together = ('employee', 'payment_dt') def __str__(self): return '{0}, ${1}, {2}'.format(self.employee, self.amount, self.payment_dt)
and the corresponding admin.py:
from django.contrib import admin from yourapp.models import Payment class LastUser: """mixin for all Admin classes to fill the last_modified_user of the mixin TransactionDT from the models.""" def save_model(self, request, object, form, change): object.last_modified_user = request.user object.save() class PaymentAdmin(LastUser, admin.ModelAdmin): list_display = ('employee', 'payment_dt') list_display_links = list_display ordering = list_display list_filter = ('employee',) search_fields = list_display fieldsets = (('Key', {'classes': ('wide', 'extrapretty'), 'fields': ('employee',) }), ('Value', {'classes': ('wide', 'extrapretty'), 'fields': ('amount', 'payment_dt') }), ) admin.site.register(Payment, PaymentAdmin)
Run makemigrations and migrate. Now we need some users to be able to log into the admin. I like to create my test data with scripts; see the bottom of this blog for the complete script.
Start Runserver and log into the admin with one of the created users. You can now add, edit and delete payments. Not only that, but at the top right there is this intriguing button “history”

Pressing this button may give something like:

Upon inspecting the database one sees a table called “django_admin_log”; I leave it to the reader to have a look at the contents of this table. The name of the table says it all: the changes shown are the changes made in the admin, not changes made via user screens, via an API or via raw SQL (your DBA running some update / repair script outside of Django). In summary: nice, however for our goals by far not extensive enough.
Searching the internet we come across the very useful page:
https://djangopackages.org/grids/g/model-audit
Obviously, we are not the first ones to hit upon this requirement. It seems most, if not all, of the packages listed use either middleware or signals or both. I had a look at Reversion, which seems well maintained and the most used. Changes made via Admin screens, via user screens and even via APIs will be registered, however, changes made by external SQL scripts will not end up in the history tables.
So, what are we left with? To ensure all CUD are always logged we have to look outside of Django, we have to ensure that at database level the changes are registered. Triggers (https://en.wikipedia.org/wiki/Database_trigger) are the way to go. We are going to add a table which is going to be updated outside of Django itself using these triggers. However, we do want to make use of the wonderful migrations feature of Django; when a model changes we want the history table to change as well without us even thinking about it… the best of boths worlds… the free lunch…For this example we now modify our models to:
from django.db import models class Mutation(models.Model): """ To be added to models where you want complete history. If a record is CUD by Django (admin or frontend) then mut_dt = modified_dt (give or take a few thousands of a second). However, if record is CUD outside of Django modified_dt will be NULL whereas mut_dt is still filled by trigger on database. Making mut_dt the primary key prevents the creation of "id" with a database sequence in the audit tables. The field named "id" is the copy of the id of the payment table.""" MUTATION_CHOICES = (('C', 'Create'), ('U', 'Update'), ('D', 'Delete')) mut_type = models.CharField(max_length=1, choices=MUTATION_CHOICES, null=False) mut_dt = models.DateTimeField(primary_key=True, null=False) id = models.IntegerField(null=False) class Meta: abstract = True class TransactionDT(models.Model): """ To be added to (nearly) every model. Last_modified_user to be filled in the admin and in the views.""" modified_dt = models.DateTimeField(auto_now=True, null=True) # filled by Django last_modified_user = models.ForeignKey('auth.User', verbose_name='Was changes by', null=True, blank=True) class Meta: abstract = True class PaymentAbstract(TransactionDT): EMPLOYEE_CHOICES = (('A', 'Ada'), ('B', 'Ben'), ('C', 'Carl'), ('D', 'Daisy')) employee = models.CharField(max_length=1, choices=EMPLOYEE_CHOICES, null=False) amount = models.DecimalField(max_digits=12, decimal_places=2, null=False, default=0) payment_dt = models.DateTimeField(help_text='when the payment will be executed', null=False) class Meta: abstract = True unique_together = ('employee', 'payment_dt') class Payment(PaymentAbstract): def __str__(self): return '{0}, ${1}, {2}'.format(self.employee, self.amount, self.payment_dt) class AuditPayment(Mutation, PaymentAbstract): """ The order of the fields in the database table is important for the insert statement of the stored procedure: mut_type, mut_dt followed by exactly the same variables as in Payment. Therefore the mixin Mutation has mut_type, mut_dt, id in it followed by the variables of PaymentAbstract.""" def __str__(self): return '{0}, {1}, {2}, ${3}, {4}'.format(self.mut_dt, self.mut_type, self.employee, self.amount, self.payment_dt)
The admin.py is OK as it is; no need to change anything here.
On the database (PostgreSQL in our case) we create the trigger on the payment table in order to fill the auditpayment table:
CREATE OR REPLACE FUNCTION log_history() RETURNS trigger AS $BODY$ BEGIN IF (TG_OP = 'INSERT') THEN EXECUTE 'INSERT INTO app2hist_auditpayment SELECT ''C'', now(), ($1).*' USING NEW; ELSIF (TG_OP = 'UPDATE') THEN RAISE NOTICE 'Tabel %', tg_relname || OLD.* ; EXECUTE 'INSERT INTO app2hist_auditpayment SELECT ''U'', now(), ($1).*' USING NEW; ELSIF (TG_OP = 'DELETE') THEN EXECUTE 'INSERT INTO app2hist_auditpayment SELECT ''D'', now(), ($1).*' USING OLD; END IF; RETURN NULL; -- result is ignored since this is an AFTER trigger END; $BODY$ LANGUAGE plpgsql VOLATILE COST 100; ALTER FUNCTION log_history() OWNER TO postgres; CREATE TRIGGER app2hist_payment AFTER INSERT OR UPDATE OR DELETE ON app2hist_payment FOR EACH ROW EXECUTE PROCEDURE log_history();
So what is stored in the auditpayment table if we create a payment, do two updates on it and delete it? Something like:

If we now make a model change, run makemigrations and migrate, will that work? The answer is: yes, it will. Let’s say we add a receiver to the payment model:
class Payment2Abstract(TransactionDT): EMPLOYEE_CHOICES = (('A', 'Ada'), ('B', 'Ben'), ('C', 'Carl'), ('D', 'Daisy')) RECEIVER_CHOICES = (('X', 'Alaxandra'), ('Y', 'Young'), ('Z', 'Zaanen')) employee = models.CharField(max_length=1, choices=EMPLOYEE_CHOICES, null=False) amount = models.DecimalField(max_digits=12, decimal_places=2, null=False, default=0) payment_dt = models.DateTimeField(help_text='when the payment will be executed', null=False) receiver = models.CharField(max_length=1, choices=RECEIVER_CHOICES, null=True) class Meta: abstract = True unique_together = ('employee', 'payment_dt')
and makemigrations / migrate this change. The result is that our table and our update table will be updated in such a way that the trigger will keep on doing it’s job just as intended.

Et voila (as the French can say so beautifully), in a nutshell, that is it. All changes on your objects, done via Django or outside of Django, will be reflected in the audit table.
There are a number of improvements one can think of when one wants to run this on production scale:
Both of these may be the subject of new blogs.
Colleague Marco Oskam for his intimate knowledge of databases. Writing the trigger / stored procedure is his thing. Also automating the creation of these on production scale.Alexander Kojevnikov for “hilite.me”. I used it for formatting the Python / SQL code to nice looking HTML. I used Style Colorful.
import django from django.contrib.auth.models import Group, Permission, User django.setup() # in Pycharm, needs to be done, your miles may vary !! def ins_group(nm): k = Group() k.name = nm k.save() k.permissions.add(Permission.objects.get(codename='add_payment')) k.permissions.add(Permission.objects.get(codename='change_payment')) k.permissions.add(Permission.objects.get(codename='delete_payment')) return def ins_user(us_nm, us_email, us_passw, fn, ln, gr_nm, staff, superuser): us1 = User.objects.create_user(us_nm, us_email, us_passw) us1.first_name = fn us1.last_name = ln us1.is_staff = staff us1.is_superuser = superuser us1.save() g = Group.objects.get(name=gr_nm) g.user_set.add(us1) return User.objects.all().delete() Group.objects.all().delete() ins_group('employee') # run only on development machine !! these passwords en emails are not strong ins_user('Super_1', '[email protected]', 'aabbcc', 'Super', 'Maarten', 'employee', True, True) ins_user('Emp_1', '[email protected]', 'aabbcc', 'Emp', 'Zaanen', 'employee', True, False) print('OK Created users')
Het bericht History of objects verscheen eerst op Cusbi.
]]>Het bericht Use of permissions in admin for groups verscheen eerst op Cusbi.
]]>In the documentation of Django for “permissions” it says:
Permissions are set globally per type of object, not per specific object instance. It is possible to say “Mary may change news stories,” but it’s not currently possible to say “Mary may change news stories, but only the ones she created herself” or “Mary may only change news stories that have a certain status or publication date.
Well, actually, something like that is what we want to do; in our case we want Mary to change the Museums and Buildings of the city she is a “is_staff” member for (i.e. allowed to enter Admin and have certain permissions to edit/add/delete).
Let’s sum up the boundary conditions we have for our new application on the Admin side of thing.
An example to explain usually works best; see below.
For the cities of Utrecht (these days famous for the start of the Tour the France), London and Paris (both just famous), we have models for Museum(s) and Building(s). Two models are the representation of condition number 1 above (an application with many models).
The Models.py:
from django.contrib.auth.models import AbstractUser from django.db import models class Museum(models.Model): client = models.ForeignKey('auth.Group') museum_nm = models.CharField(max_length=60, null=False) def __str__(self): return '{0}, {1}'.format(self.client, self.museum_nm) class Building(models.Model): client = models.ForeignKey('auth.Group') building_nm = models.CharField(max_length=60, null=False) height = models.IntegerField(null=True) def __str__(self): return '{0}, {1}, {2}'.format(self.client, self.building_nm, self.height)
On to the Admin.py. We introduce the class “GetClientData”. The mixin class GetClientData limits the objects seen in the list view to the objects which belong to the city of the staff member.
from django.contrib import admin from .models import Museum, Building from .forms import MuseumForm, BuildingForm class GetClientData: """ Limit apparatus / building list view to the objects that belong to the request's user group(s) (= client(s)). Superuser sees all. Thank you: http://reinout.vanrees.org/weblog/2011/09/30/django-admin-filtering.html """ def get_queryset(self, request): qs = super(GetClientData, self).get_queryset(request) return qs if request.user.is_superuser else \ qs.filter(client__in=request.user.groups.all()) class MuseumAdmin(GetClientData, admin.ModelAdmin): form = MuseumForm admin.site.register(Museum, MuseumAdmin) class BuildingAdmin(GetClientData, admin.ModelAdmin): form = BuildingForm admin.site.register(Building, BuildingAdmin)
Ok, so now our staff members only sees the list of her/his city.
Below we see the screen shot for the Paris_user for the listview:

And for the superuser:

But what if the Paris_user clicks on one of the objects to edit it? What they will see in the edit screen is a dropdown list of the cities containing ALL the cities. This is not what we want. We want to limit the dropdown box for the edit and add views to the city of the staff member (except for the superuser, (s)he should see a dropdown box with all the cities).
Back to the forms.py and let’s limit the dropdown box. In order to do this we must know our user. There is an excellent package I have used multiple times to good effect to ALWAYS know who the owner of the session is: cuser. Get it from PyPI. This is middleware so you have to do just a little more than just installing it; you have to make 2 modifications to your settings.py.
Once you have installed it modify forms.py as follows:
from django.forms import ModelForm from .models import Museum, Building from django.contrib.auth.models import Group from cuser.middleware import CuserMiddleware class GroupDropDown: """ Limit the dropdown box for the change / add screens to the city (=groups) the user belongs to. Thank you: https://groups.google.com/forum/?fromgroups=#!topic/django-users/s9a0J6fKgWs """ def __init__(self, *args, **kwargs): super(GroupDropDown, self).__init__(*args, **kwargs) # gets the logged in user, even if there is no request user = CuserMiddleware.get_user() # transform a many-2-many query to a list of ids gps = [x.id for x in user.groups.all()] # to which groups does (s)he belong? superuser belongs to all. groups = Group.objects.filter(id__in=gps) \ if not user.is_superuser else Group.objects.all() group_choices = [] if groups is None: # this is not good if this happens... a staff member without a city group_choices.append(('', '---------')) for group in groups: group_choices.append((group.id, group.name)) self.widget = self.fields['client'].widget self.widget.choices = group_choices class MuseumForm(GroupDropDown, ModelForm): class Meta: model = Museum fields = '__all__' class BuildingForm(GroupDropDown, ModelForm): class Meta: model = Building fields = '__all__'
The GroupDropDown is again made as a mixin to avoid duplicating code for each and every class.
Et voila, as the French say, this has done the trick.
The dropdown box in the edit view for the Paris_user:

The dropdown box for the superuser:

I hope this is useful,
Greetings,
Maarten Zaanen
Thank you Alexander Kojevnikov for “hilite.me” for formatting the Python code to nice looking HTML. I used Style Native.
Below the code for fill_data_initial.py
# in Django console at the prompt run: execfile('fill_data_initial.py') # Import general Python stuff # import Django stuff import django django.setup() # in Pycharm, needs to be done, your miles may vary !! from django.contrib.auth.models import Group, User, Permission from django.db.models import Q # Import own stuff from tteesstt.models import Museum, Building def ins_group(nm): k = Group() k.name = nm k.save() # can not add more than 1 with the .add statement, never mind, # this is not the issue in this blog k.permissions.add(Permission.objects.get(codename='add_museum')) k.permissions.add(Permission.objects.get(codename='change_museum')) k.permissions.add(Permission.objects.get(codename='add_building')) k.permissions.add(Permission.objects.get(codename='change_building')) return Group.objects.all().delete() ins_group('Paris') ins_group('London') ins_group('Utrecht') print('OK Groups') def ins_user(us_nm, us_email, us_passw, fn, ln, gr_nm): """ create users for accounts, not superusers, they are created seperately. """ us1 = User.objects.create_user(us_nm, us_email, us_passw) us1.first_name = fn us1.last_name = ln us1.is_staff = True us1.save() g = Group.objects.get(name=gr_nm) g.user_set.add(us1) return # User.objects.filter(~Q(username='a')).delete() # do not delete superuser 'a' created manually User.objects.all().delete() # also delete superuser # make superuser User.objects.create_superuser(username='a', password='a', email='[email protected]') # make staff users ins_user('Paris_user', '[email protected]', 'P', 'Jean', 'Lafitte', 'Paris') ins_user('London_user', '[email protected]', 'L', 'Mary', 'Scott', 'London') ins_user('Utrecht_user', '[email protected]', 'U', 'Maarten', 'Zaanen', 'Utrecht') print('OK Created users') def ins_museum(cl, nm): g = Museum() g.client = Group.objects.get(name=cl) g.museum_nm = nm g.save() return Museum.objects.all().delete() ins_museum('Paris', 'Centre Georges Pompidou') ins_museum('Paris', 'Musée d''Orsay') ins_museum('London', 'Science Museum') ins_museum('London', 'The British Museum') ins_museum('Utrecht', 'Museum Speelklok') ins_museum('Utrecht', 'Spoorwegmuseum') print('OK Museum') def ins_building(cl, nm, ht): g = Building() g.client = Group.objects.get(name=cl) g.building_nm = nm g.height = ht g.save() return Building.objects.all().delete() ins_building('Paris', 'Eiffel Tower', 324) ins_building('Paris', 'Montparnasse Tower', 220) ins_building('London', 'Big Ben', 96) ins_building('London', 'London Eye', 135) ins_building('Utrecht', 'de Dom', 111) ins_building('Utrecht', 'Rabo', 105) print('OK Buildings')
Het bericht Use of permissions in admin for groups verscheen eerst op Cusbi.
]]>