2021-10-22

Pandas Cheat Sheet

Part of an evolving cheat sheet

Read a file

For CSV use

data = pd.read_csv('data.csv')

For excel use

df = pd.read_excel (r'./Detected as deleted from API data.xlsx')

You might need to install a library for xlsx files

pip install openpyxl

Get a summary of the data

Numerical summaries

data.describe()

Types of columns

data.dtypes

Create a new column from combining with another

This adds up values in two columns

data["totalqueuelength"] = data["Svc"] + data["Svc Que"]

This converts a couple of columns that have the data and time to a single field and turns it into a date

data["datetime"] = pd.to_datetime(data["Date"] + " " + data["Time"], format='%m/%d/%Y %I:%M %p')

Check date formats at https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior

Create a new column from a function

df = df.assign(percent=lambda x: 100* x['deleted on date'] / x['total'])

Binning

You can bin the data by adding a new bin column to the dataframe

df['binning'] = pd.cut(df['percent'], 5) # 5 bins based on the percent column

Average a column

df['column'].mean()

Find row at an index

df.iloc[[int(totalRecords * .95)]] # Find the row at the 95th percentile

Filter data rows

data.loc[(data['datetime'] > '2021-10-16')]

Sort

df.sort_values(by=['percent'], ascending=False)

2021-10-12

Running Serverless Offline with a Self-Signed SSL Certificate

If you find yourself in need of running serverless locally using serverless offline and you want an SSL certificate then fear not, it’s not all that difficult. First you’ll need an SSL certificate. For our purposes you we’re going to use a self-signed certificate. This will cause browsers to complain but for local testing it isn’t typically a big problem.

Generate a Cert Using OpenSSL

You should install OpenSSL (or one of the more secure alternatives like LibreSSL) and then run

openssl req -newkey rsa:2048 -nodes -keyout key.pem -x509 -days 365 -out cert.pem

This will prompt you for a bunch of information about your organization. You can put anything you want in most of those fields but do pay attention to the Common Name field which needs to hold the value of localhost.

These are the answers I gave

Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:TX
Locality Name (eg, city) []:Austin
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Inventive
Organizational Unit Name (eg, section) []:
Common Name (e.g. server FQDN or YOUR name) []:localhost
Email Address []:

You should now have a cert.perm and a key.pem in your local directory. Copy these into a cert folder at the root of your serverless project.

Running with the Certificate

Now you need to tell serverless where to find your certificate. You can either run with the flag

--httpsProtocol cert

or update your serverless.yml to include the cert directory

custom:
  serverless-offline:
    httpsProtocol: "cert"
    ...

Gotchas

If you’re seeing a warning about an invalid certificate then check that you’re accessing serverless via localhost and not 127.0.0.1 or 0.0.0.0. SSL works with domain names so you need to use one, even if it is just localhost.

2021-10-05

Configuration in Azure Isolated Functions

This is all done in the Program.cs. If you want to use the IOptions pattern which, let’s face it, everybody does. Then you can start with creating your configuration classes (I like more than one so config for different parts of the app remain logical distinct). These are POCOs

public class AuthConfiguration
{

    public string TenantId { get; set; }
    public string ClientId { get; set; }
    public string ClientSecret { get; set; }
    public string RedirectUrl { get; set; }
}

Then set this up in the host builder

var host = new HostBuilder()
            .ConfigureFunctionsWorkerDefaults()
            .ConfigureServices(container =>
            {
                container.AddOptions<AuthConfiguration>().Configure<IConfiguration>((settings, config) =>
                {
                    config.Bind(settings);
                });
            })
            .Build();

If this looks familiar it’s because it totally is! All of this uses the generic .NET host so this same sort of pattern should work in most .NET apps now.

2021-10-05

Closed form Fibonacci

A favorite programming test question is the Fibonacci sequence. This is defined as either 1 1 2 3 5... or 0 1 1 2 3 5... depending on what you feel fib of 0 is. In either case fibonacci is the sum of the two previous terms. So fib(10) = fib(9) + fib(8). The reason this is a programming test favorite is because it forces people to think about recursion and even memoization for performance.

There is however a shortcut for this and that is to use the closed form which uses the golden ratio. We have two interesting values called Phi and phi with the former being the golden ratio and the latter being a value observed in nature for leaf dispersions.

Phi = (1 + root(5))/2
phi = (1-root(5))/2

We can use this to create Binet’s formula (Jacques Philippe Marie Binet was a French mathematician born in 1786, although this formula is named for him it was really discovered by fellow French mathematician Abraham do Moivre a century earlier)

fib(n) = (Phi^n - phi^n)/(Phi - phi)

In code we can do the following to generate all the fib numbers up to n

static double Phi = (1 + Math.Pow(5,.5))/2;
static double phi = (1 - Math.Pow(5,.5))/2;
static ulong[] generateFibonaccisClosed(int n){
    ulong[] fib = new ulong[n];
    for(int i = 0; i<n; i++)
    {
        fib[i] = (ulong)((Math.Pow(Phi, i)-Math.Pow(phi, i))/(Phi - phi));
    }
    return fib;
}

The advantage here is that the closed form is constant time, constant memory and uses only about 3 64-bit values (fewer if you calculate phi and Phi as you go). It’s certainly a lot of fun to break out in a programming test.

2021-09-17

Filtering Datatables

Years back there was this crazy way of dealing with data in .NET called a DataSet. DataSets contained DataTables which contained a combination of DataRows and DataColumns. It was all loosely typed and keyed with strings. Basically a database inside of your process. Even when they were cool I felt uncomfortable using them. Because I sometimes maintain legacy code I run into these monstrosities.

Today’s problem was that I needed to filter the contents of a table before bulk loading it. You can actually do simple filtering using a quasi-SQL like

   var dataRows = existingDataTable.Select("UserName = 'simon'")

This gives you back a collection of DataRows which I guess you could inset back into the table after clearing it of rows. To make this useful there is an extension method called CopyToDataTable in System.Data.DataExtension (be sure to include the dll for this). Using that you can get back a full data table

var dataTable = existingDataTable.Select("UserName = 'simon'").CopyToDataTable();

In that same package, though, is a much better tool for filtering: converting the table to an IEnumerable. You still need to use some magic strings but that’s somewhat unavoidable in data tables.

var dt = existingDataTable.AsEnumerable()
                    .Where(r => r.Field<String>("UserName") == "Simon").CopyToDataTable();

You can also do more advanced things like checking to see if something is a GUID

var dt = existingDataTable.AsEnumerable()
                    .Where(r => Guid.TryParse(r.Field<String>("Id"), out var _)).CopyToDataTable();

2021-09-08

Installing VSFTP from source

The 3.0.3 version of VSFTP seems to have an exploit against it so you should update to 3.0.5. However at the current time that has not been released in Debian upstream so we get to compile it ourselves!

First get and decompress the source

wget https://security.appspot.com/downloads/vsftpd-3.0.5.tar.gz
tar xvf vsftpd-3.0.5.tar.gz
cd vsftpd-3.0.5

Now you’re going to need to edit the builddefs.h specifically you want to enable SSL with

 #define VSF_BUILD_SSL

You may need to install the open ssl headers

sudo apt install libssl-dev

Next run a standard make/make install

make
make install

That should be it. If you’re replacing an apt sourced vsftpd then remember to uninstall that and you will have to update the init.d script to point at the new one which is in /usr/local/sbin instead of /usr/sbin. You could also change the install prefix but I like local better.

2021-08-31

SQL Mail

Did you know you can hook up your SQL server (or Managed SQL on Azure) to an SMTP server and use it to send email. Terrible idea? Yes, probably. I really encourage people not to build business logic that might require creating an email into stored procs. Required for legacy code? Yes, certainly.

You first need to tell SQL server how to talk to the mail server. This is done using the sysmail_add_Account_sp

EXECUTE msdb.dbo.sysmail_add_account_sp
    @account_name = 'Database Mail Account',
    @description = 'SQL Server Notification Service',
    @email_address = 'SQLServer@somedomain.com',
    @replyto_address = 'SQLServer@somedomain.com',
    @display_name = 'Database Mail Profile',
    @mailserver_name = 'smtp.office365.com',
    @port = 587,
    @username = 'SQLServer@somedomain.com',
    @password = 'totallynotourpassword',
    @enable_ssl = 1;

With that in place you can set up a mail profile to use it

SELECT @sequence_number = COALESCE(MAX(profile_id),1) FROM msdb.dbo.sysmail_profile;

-- Create a mail profile
EXECUTE msdb.dbo.sysmail_add_profile_sp
    @profile_name = 'Database Mail Profile',
    @description = 'Sends email from the db';

-- Add the account to the profile
EXECUTE msdb.dbo.sysmail_add_profileaccount_sp
    @profile_name = 'Database Mail Profile',
    @account_name = 'Database Mail Account',
    @sequence_number = @sequence_number;

-- Grant access to the profile to the DBMailUsers role
EXECUTE msdb.dbo.sysmail_add_principalprofile_sp
    @profile_name = 'Database Mail Profile',
    @principal_id = 0,
    @is_default = 1 ;

You can then make use of sp_send_dbmail to send email

EXEC msdb.dbo.[sp_send_dbmail]
    @profile_name = 'Database Mail Profile',
    @recipients = 'simon.timms@somedomain.com',
    @subject = 'Testing db email',
    @body = 'Hello friend, I''m testing the database mail'

You can check the status of the sent email by querying

select * from msdb.dbo.sysmail_allitems

If things fail then checking the event log may be helpful

select * from msdb.dbo.sysmail_event_log

2021-08-31

Reauthenticate with Nuget

If you have a private nuget feed authenticated with a password chances are your password will eventually expire or change. For some reason Visual Studio and perhaps nuget under the covers aggressively caches that password and doesn’t prompt you when the password doesn’t work anymore. To change the password the easiest approach I’ve found is to use the nuget.exe command line tool and run

c:\temp\nuget.exe sources update -Name "Teamcity" -Source "https://private.nuget.feed.com/httpAuth/app/nuget/feed/_Root/SomeThing/v2" -UserName "simon.timms" -Password "Thisisactuallymypassword,no,really"
C:\temp\nuget.exe list -Source teamcity

2021-08-13

Which Key to use for Managed Identity in Keyvault

I have a terraform deployment which runs in azure pipeline. Azure pipelines is connected to Azure via a service connection. This service connection is registered as an application in the Azure AD of the Azure account. The problem I constantly run into is that I can’t remember which id from the application should be granted keyvault access so the build pipeline can read and write to keyvault.

resource "azurerm_key_vault_access_policy" "terraformaccess" {
  key_vault_id = azurerm_key_vault.keyvault.id

  tenant_id = local.tenant_id
  object_id = ???????????????????

  key_permissions = [
    "Get",
    "Create",
    "List",
    "Update",
    "Verify",
    "Delete",
    "WrapKey",
    "UnwrapKey"
  ]

  secret_permissions = [
    "Get",
    "List",
    "Set"
  ]

  storage_permissions = [
    "Get",
    "List",
    "Set",
    "Update"
  ]
}

Maybe the value is in the portal somewhere:

Nope.

It seems to be findable by doing either

Login-AzureRmAccount -SubscriptionId <your subscription id>;
$spn=(Get-AzureRmADServicePrincipal -SPN <the application id>);
echo $spn.Id

or

 az ad sp list --spn <the application id>

Then look for ObjectId

2021-07-30

Stop Terraform Managing State for a Resource

Say you want to keep a resource but you want to stop terraform from managing it. You can ask terraform to update its state to forget about it. In my case I want terraform to forget it managing my Azure Static Web App because Terraform doesn’t support all the options I need and will clobber the app.

I can run this

terraform state rm "azurerm_static_site.agentportal"

If I decide to start managing the state again I can just run a terraform import to manage it again.

A blog about computer programming and technology.

My Books