Simon Online

2021-10-22

Using RLS in Power BI Embedded

Power BI is a funny animal. On some levels it is a logical successor to SSRS but on other levels it is a totally different beast. One of the ways it differs greatly from SSRS is in handling parameters, especially secure parameters. When embedding an SSRS report you could specify the parameter value in a secure fashion and then now show the selector to end users.

In many cases there is a need to use row level security (RLS) to restrict the data that a user can see in Power BI. There are a myriad of ways to do this wrong and I think I’ve explored most of them over the last few days. There is also at least one way that works.

A tempting approach is to use a filter. These can be applied at render time in the browser by adding to the config when embedding the report.

const config = {
      visualName: '',
      type: 'report',
      accessToken: token, 
      embedUrl: token.embedUrl,
      tokenType: powerbi.models.TokenType.Embed,
      permissions: permissions,
      // 👇 filters
      filters: [
        {
          $schema: "http://powerbi.com/product/schema#basic",
          target: {
              table: "monthlyProducts",
              column: "userId"
          },
          operator: "In",
          values: ["stimms"]
        }
      ],
      // ☝️
      settings: {
        panes: {
          filters: {
            visible: true,
          },
          pageNavigation: {
            visible: true,
          },
        },
      },
    }

This class of parameter is fine for providing filters that can be updated later by the user. However, it should not be used for parameters that require some degree of security like a user name. These parameters are easily changed and, unless your parameter are in some way cryptographically secure there is a good chance you’re introducing a broken access control - #1 on the OWASP top 10.

Instead of this approach you can use the manage roles functionality in Power BI.

This functionality is designed to provide high level filters for data. A lot of the examples I see are for things like restricting a user in the East region from seeing the data of a user in the West region. This is done by assigning a role to that user when generating the embedding token. Then you’d set up a role for each region (see 1) and apply a filter expression to your tables so the only records with a region of East would show up.

This is a simplistic and somewhat tiresome approach to adding a role mapping. What if a new region like SouthEast is introduced? What isn’t, perhaps, as clear is that DAX expression can contain dynamic functions like UserName() which make filtering more powerful. This UserName() will be bound to the effective identity passed in

What I settled on for my filtering was to have a single role which I enforce at the embedded token generation level and then filter my data by the UserName() which I also set at the embedded token level. Because these are set at the embedded token generation time which occurs on the server I can be confident that I’m not providing a way for somebody to view data they shouldn’t.

The code for generation looks like this:

var tokenRequest = new GenerateTokenRequestV2(
                    reports: new List<GenerateTokenRequestV2Report>
                    {
                        new GenerateTokenRequestV2Report(reportId)
                    },
                    datasets: new List<GenerateTokenRequestV2Dataset> { new GenerateTokenRequestV2Dataset(report.DatasetId) },
                    identities: new List<EffectiveIdentity> { new EffectiveIdentity(user.ContactId.ToString(),
                                                                roles: new List<string> { "Users" },
                                                                datasets: new List<String>{report.DatasetId })
                    }
                );
var embedToken = await client.EmbedToken.GenerateTokenAsync(tokenRequest);

In this particular case the data I was returning from the database could be accessed by a number of different people depending on to which group they belonged. Initially I tried crossing the data against the user/group matrix but the cardinality of the resulting set was in the billions and totally overwhelmed both SQL Server and Power BI. Instead what I did was pull in the user/group matrix and the dataset that exposed the group id. In Power BI I did a merge of the datasets along with applying the row level filtering. This was necessary because, as far as I know, there is no way to pass the user name down to the actual query running against the server.

With all this in place I got good security and good performance. But, wow, did it take me a long time to get there.

2021-10-22

Mathplotlib cheat sheet

Part of an evolving cheat sheet

Include the library

import matplotlib.pyplot as plt

Set the plot size to be larger

plt.rcParams['figure.figsize'] = [30, 21]

Set plot title

plt.title("some title)

Plot a line chart

  plt.plot(filtered['datetime'], filtered['totalsales'], label="Sales Ingest")
  plt.show()

Add Legend

plt.legend()
2021-10-22

Pandas Cheat Sheet

Part of an evolving cheat sheet

Read a file

For CSV use

data = pd.read_csv('data.csv')

For excel use

df = pd.read_excel (r'./Detected as deleted from API data.xlsx')

You might need to install a library for xlsx files

pip install openpyxl

Get a summary of the data

Numerical summaries

data.describe()

Types of columns

data.dtypes

Create a new column from combining with another

This adds up values in two columns

data["totalqueuelength"] = data["Svc"] + data["Svc Que"]

This converts a couple of columns that have the data and time to a single field and turns it into a date

data["datetime"] = pd.to_datetime(data["Date"] + " " + data["Time"], format='%m/%d/%Y %I:%M %p')

Check date formats at https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior

Create a new column from a function

df = df.assign(percent=lambda x: 100* x['deleted on date'] / x['total'])

Binning

You can bin the data by adding a new bin column to the dataframe

df['binning'] = pd.cut(df['percent'], 5) # 5 bins based on the percent column

Average a column

df['column'].mean()

Find row at an index

df.iloc[[int(totalRecords * .95)]] # Find the row at the 95th percentile

Filter data rows

data.loc[(data['datetime'] > '2021-10-16')]

Sort

df.sort_values(by=['percent'], ascending=False)
2021-10-12

Running Serverless Offline with a Self-Signed SSL Certificate

If you find yourself in need of running serverless locally using serverless offline and you want an SSL certificate then fear not, it’s not all that difficult. First you’ll need an SSL certificate. For our purposes you we’re going to use a self-signed certificate. This will cause browsers to complain but for local testing it isn’t typically a big problem.

Generate a Cert Using OpenSSL

You should install OpenSSL (or one of the more secure alternatives like LibreSSL) and then run

openssl req -newkey rsa:2048 -nodes -keyout key.pem -x509 -days 365 -out cert.pem

This will prompt you for a bunch of information about your organization. You can put anything you want in most of those fields but do pay attention to the Common Name field which needs to hold the value of localhost.

These are the answers I gave

Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:TX
Locality Name (eg, city) []:Austin
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Inventive
Organizational Unit Name (eg, section) []:
Common Name (e.g. server FQDN or YOUR name) []:localhost
Email Address []:

You should now have a cert.perm and a key.pem in your local directory. Copy these into a cert folder at the root of your serverless project.

Running with the Certificate

Now you need to tell serverless where to find your certificate. You can either run with the flag

--httpsProtocol cert

or update your serverless.yml to include the cert directory

custom:
  serverless-offline:
    httpsProtocol: "cert"
    ...

Gotchas

If you’re seeing a warning about an invalid certificate then check that you’re accessing serverless via localhost and not 127.0.0.1 or 0.0.0.0. SSL works with domain names so you need to use one, even if it is just localhost.

2021-10-05

Configuration in Azure Isolated Functions

This is all done in the Program.cs. If you want to use the IOptions pattern which, let’s face it, everybody does. Then you can start with creating your configuration classes (I like more than one so config for different parts of the app remain logical distinct). These are POCOs

public class AuthConfiguration
{

    public string TenantId { get; set; }
    public string ClientId { get; set; }
    public string ClientSecret { get; set; }
    public string RedirectUrl { get; set; }
}

Then set this up in the host builder

var host = new HostBuilder()
            .ConfigureFunctionsWorkerDefaults()
            .ConfigureServices(container =>
            {
                container.AddOptions<AuthConfiguration>().Configure<IConfiguration>((settings, config) =>
                {
                    config.Bind(settings);
                });
            })
            .Build();

If this looks familiar it’s because it totally is! All of this uses the generic .NET host so this same sort of pattern should work in most .NET apps now.

2021-10-05

Closed form Fibonacci

A favorite programming test question is the Fibonacci sequence. This is defined as either 1 1 2 3 5... or 0 1 1 2 3 5... depending on what you feel fib of 0 is. In either case fibonacci is the sum of the two previous terms. So fib(10) = fib(9) + fib(8). The reason this is a programming test favorite is because it forces people to think about recursion and even memoization for performance.

There is however a shortcut for this and that is to use the closed form which uses the golden ratio. We have two interesting values called Phi and phi with the former being the golden ratio and the latter being a value observed in nature for leaf dispersions.

Phi = (1 + root(5))/2
phi = (1-root(5))/2

We can use this to create Binet’s formula (Jacques Philippe Marie Binet was a French mathematician born in 1786, although this formula is named for him it was really discovered by fellow French mathematician Abraham do Moivre a century earlier)

fib(n) = (Phi^n - phi^n)/(Phi - phi)

In code we can do the following to generate all the fib numbers up to n

static double Phi = (1 + Math.Pow(5,.5))/2;
static double phi = (1 - Math.Pow(5,.5))/2;
static ulong[] generateFibonaccisClosed(int n){
    ulong[] fib = new ulong[n];
    for(int i = 0; i<n; i++)
    {
        fib[i] = (ulong)((Math.Pow(Phi, i)-Math.Pow(phi, i))/(Phi - phi));
    }
    return fib;
}

The advantage here is that the closed form is constant time, constant memory and uses only about 3 64-bit values (fewer if you calculate phi and Phi as you go). It’s certainly a lot of fun to break out in a programming test.

2021-09-17

Filtering Datatables

Years back there was this crazy way of dealing with data in .NET called a DataSet. DataSets contained DataTables which contained a combination of DataRows and DataColumns. It was all loosely typed and keyed with strings. Basically a database inside of your process. Even when they were cool I felt uncomfortable using them. Because I sometimes maintain legacy code I run into these monstrosities.

Today’s problem was that I needed to filter the contents of a table before bulk loading it. You can actually do simple filtering using a quasi-SQL like

   var dataRows = existingDataTable.Select("UserName = 'simon'")

This gives you back a collection of DataRows which I guess you could inset back into the table after clearing it of rows. To make this useful there is an extension method called CopyToDataTable in System.Data.DataExtension (be sure to include the dll for this). Using that you can get back a full data table

var dataTable = existingDataTable.Select("UserName = 'simon'").CopyToDataTable();

In that same package, though, is a much better tool for filtering: converting the table to an IEnumerable. You still need to use some magic strings but that’s somewhat unavoidable in data tables.

var dt = existingDataTable.AsEnumerable()
                    .Where(r => r.Field<String>("UserName") == "Simon").CopyToDataTable();

You can also do more advanced things like checking to see if something is a GUID

var dt = existingDataTable.AsEnumerable()
                    .Where(r => Guid.TryParse(r.Field<String>("Id"), out var _)).CopyToDataTable();
2021-09-08

Installing VSFTP from source

The 3.0.3 version of VSFTP seems to have an exploit against it so you should update to 3.0.5. However at the current time that has not been released in Debian upstream so we get to compile it ourselves!

First get and decompress the source

wget https://security.appspot.com/downloads/vsftpd-3.0.5.tar.gz
tar xvf vsftpd-3.0.5.tar.gz
cd vsftpd-3.0.5

Now you’re going to need to edit the builddefs.h specifically you want to enable SSL with

 #define VSF_BUILD_SSL

You may need to install the open ssl headers

sudo apt install libssl-dev

Next run a standard make/make install

make
make install

That should be it. If you’re replacing an apt sourced vsftpd then remember to uninstall that and you will have to update the init.d script to point at the new one which is in /usr/local/sbin instead of /usr/sbin. You could also change the install prefix but I like local better.

2021-08-31

SQL Mail

Did you know you can hook up your SQL server (or Managed SQL on Azure) to an SMTP server and use it to send email. Terrible idea? Yes, probably. I really encourage people not to build business logic that might require creating an email into stored procs. Required for legacy code? Yes, certainly.

You first need to tell SQL server how to talk to the mail server. This is done using the sysmail_add_Account_sp

EXECUTE msdb.dbo.sysmail_add_account_sp
    @account_name = 'Database Mail Account',
    @description = 'SQL Server Notification Service',
    @email_address = 'SQLServer@somedomain.com',
    @replyto_address = 'SQLServer@somedomain.com',
    @display_name = 'Database Mail Profile',
    @mailserver_name = 'smtp.office365.com',
    @port = 587,
    @username = 'SQLServer@somedomain.com',
    @password = 'totallynotourpassword',
    @enable_ssl = 1;

With that in place you can set up a mail profile to use it

SELECT @sequence_number = COALESCE(MAX(profile_id),1) FROM msdb.dbo.sysmail_profile;

-- Create a mail profile
EXECUTE msdb.dbo.sysmail_add_profile_sp
    @profile_name = 'Database Mail Profile',
    @description = 'Sends email from the db';

-- Add the account to the profile
EXECUTE msdb.dbo.sysmail_add_profileaccount_sp
    @profile_name = 'Database Mail Profile',
    @account_name = 'Database Mail Account',
    @sequence_number = @sequence_number;

-- Grant access to the profile to the DBMailUsers role
EXECUTE msdb.dbo.sysmail_add_principalprofile_sp
    @profile_name = 'Database Mail Profile',
    @principal_id = 0,
    @is_default = 1 ;

You can then make use of sp_send_dbmail to send email

EXEC msdb.dbo.[sp_send_dbmail]
    @profile_name = 'Database Mail Profile',
    @recipients = 'simon.timms@somedomain.com',
    @subject = 'Testing db email',
    @body = 'Hello friend, I''m testing the database mail'

You can check the status of the sent email by querying

select * from msdb.dbo.sysmail_allitems

If things fail then checking the event log may be helpful

select * from msdb.dbo.sysmail_event_log 
2021-08-31

Reauthenticate with Nuget

If you have a private nuget feed authenticated with a password chances are your password will eventually expire or change. For some reason Visual Studio and perhaps nuget under the covers aggressively caches that password and doesn’t prompt you when the password doesn’t work anymore. To change the password the easiest approach I’ve found is to use the nuget.exe command line tool and run

c:\temp\nuget.exe sources update -Name "Teamcity" -Source "https://private.nuget.feed.com/httpAuth/app/nuget/feed/_Root/SomeThing/v2" -UserName "simon.timms" -Password "Thisisactuallymypassword,no,really"
C:\temp\nuget.exe list -Source teamcity