Banno Usage Data
Introduction
The goal of this pipeline is to make the Banno Usage Data dataset (previously called Mixpanel dataset) available in JH-managed Google Big Query where it can be queried using the SQL interface for business purposes and to drive downstream business processes.
The usage pipeline is being updated to use the new schema. For the main table called “Events”, Unlike the old schema which is flat, the new schema consists of nested JSON column called “Properties”. The “Properties” column has various optional fields which may or may not be populated depending on the client telemetry coming in from Mixpanel. Detailed description of various schema properties collected is provided below.
Scenarios
The Banno Usage Data dataset consists of two tables, namely, events and people. Both tables hold multi-tenant data separated by a single key which is a GUID that uniquely identifies a specific Financial Institution. In addition, the name of the Financial Institution is also included in each row.
Pipelines
The Banno Usage pipeline consists of two different pipelines.
- The first pipeline writes raw datasets to Google Cloud Storage storage buckets in Jack Henry Data Hub Google project. This pipeline creates and manages the initial and subsequent incremental loading of data into the two tables in batch mode, typically once an hour.
- The second pipeline parses the raw data files from Google Cloud Storage (written by the previous pipeline) in realtime, flattens certain nested fields, and loads the data into Big Query tables in the Jack Henry Data Hub Google project.
Schema
master
Raw Name | Display Name | Description |
---|---|---|
$app_build_number | App Build Number | General build of the app. |
$app_version_string | App Version | Current app version. |
$bluetooth_enabled | Bluetooth | Set to true if Bluetooth is enabled, false if not. |
$bluetooth_version | Bluetooth Version | Set to “none”, “ble”, or “classic”. |
$brand | Brand | Device brand. |
$browser | Browser | Name of the browser. |
$browser_version | Browser Version | Version of the browser. |
$carrier | Carrier | Wireless carrier of the device owner. |
$city | City | The city of the event sender parsed from the IP property or the Latitude and Longitude properties. |
$current_url | Current URL | The URL of the page on which the event was tracked. |
$device | Device | Name of the event sender’s device, if they’re on mobile web. |
$device_id | Device ID | Autogenerated ID that is local to the device. Calling reset() regenerates this value. |
$duration | Duration | A property that is appended to an event to capture the elapsed time (in seconds) between a time_event and a track call. |
$google_play_services | Google Play Services | Verifies that Google Play services is installed and enabled on this device, and that the version installed on this device is no older than the one required by this client. |
$has_nfc | Has NFC | The device supports Near Field Communication (NFC). |
$has_telephone | Has Telephone | Set to true if this device has telephone functionality, false if not. |
$import | Import | Internal Mixpanel property set to true to indicate that events were sent through /import API. |
$initial_referrer | Initial Referrer | Referring URL when the user first arrived on your site. Defaults to “$direct” if the user is not referred. |
$initial_referring_domain | Initial Referring Domain | Referring domain at first arrival. Defaults to “$direct” if the user is not referred. |
$insert_id | Insert ID | A unique identifier for the event, used to deduplicate events that are accidentally sent multiple times. |
$lib_version | Library Version | Mixpanel library version. |
$manufacturer | Manufacturer | Device manufacturer. |
$model | Model | The model of the device. |
$mp_api_endpoint | API Endpoint | Mixpanel property to record the API endpoint the data was sent to:api.mixpanel.com - default ingestionapi-eu.mixpanel.com - EU data ingestionapi-in.mixpanel.com - India data ingestionapi-js.mixpanel.com - Javascript SDK |
$mp_api_timestamp_ms | API Timestamp | UTC timestamp in milliseconds when the event was received by our API. |
$os | Operating System | OS of the event sender. |
$os_version | OS Version | The current version of operating system on the device. |
$radio | Radio | Current cellular network communication standard (3G, 4G, LTE, etc). |
$referrer | Referrer | Referring URL including your own domain. Might not be present if document.referrer does not return a value. |
$referring_domain | Referring Domain | Referring domain including your own domain. Might not be present if document.referrer does not return a value. |
$region | Region | The region (state or province) of the event sender parsed from the IP property or the Latitude and Longitude properties. |
$screen_dpi | Screen DPI | Pixel density of the device screen. |
$screen_height | Screen Height | The height of the device screen in pixels / points (iOS). |
$screen_width | Screen Width | The width of the device screen in pixels / points (iOS). |
$search_engine | Search Engine | The search engine that the customer used when they arrived at your domain. |
$user_id | User ID | The identified ID of the user. Calling identify() sets this. |
$watch_model | Watch Model | The model of the iOS watch. |
$wifi | Wifi | Set to true if the user’s device has an active, available Wifi connection, false if not. |
mp_country_code | Country | The country of the event sender parsed from the IP property or the Latitude and Longitude properties. The value is stored as a 2-letter country code in the raw data and parsed into the country name in the UI. |
mp_keyword | Search Keyword | Search keywords detected on the referrer from a search engine to your domain. |
mp_lib | Mixpanel Library | The Mixpanel library that sent the event. |
mp_processing_time_ms | Time Processed (UTC) | UTC timestamp in milliseconds when the event was processed by Mixpanel servers. |
mp_sent_by_lib_version | Sent By Library Version | Mixpanel library version used to send data (not necessarily the same as the version which enqueued the data). |
utm_source, utm_medium, etc. | UTM Parameters | UTM tags derived from the URL a customer clicked to arrive at your domain. Each UTM will be collected under its own property. |
people
Raw Name | Display Name | Description |
---|---|---|
$android_app_version | Android App Version | Current Android app version (versionName). |
$android_app_version_code | Android App Version Code | Current Android app version code (versionCode). |
$android_brand | Android Brand | Android device brand. |
$android_lib_version | Android Lib Version | Last used Mixpanel Android / Unity library version. |
$android_manufacturer | Android Manufacturer | Android device manufacturer. |
$android_model | Android Model | The model of the Android device. |
$android_os | Android OS | Set when Android operating system was used. |
$android_os_version | Android OS Version | Last version of Android on the device. |
$browser | Browser | Name of the browser. |
$browser_version | Browser Version | Version of the browser. |
$city | City | The city of the user parsed from the IP property or the Latitude and Longitude properties. |
$country_code | Country Code | The country of the user parsed from the IP property or the Latitude and Longitude properties. |
$geo_source | Geo Source | Set to “reverse_geocoding” if profile geolocation properties were determined through Latitude and Longitude. |
$initial_referrer | Initial Referrer | Referring URL when the user first arrived on your site. Defaults to “$direct” if the user is not referred. |
$initial_referring_domain | Initial Referring Domain | Referring domain at first arrival. Defaults to “$direct” if the user is not referred. |
$ios_app_release | iOS App Release | The release or version number of the iOS app (CFBundleShortVersionString). |
$ios_app_version | iOS App Version | The version of the build that identifies an iteration of the iOS app (CFBundleVersion). |
$ios_device_model | iOS Device Model | The model of the iOS device. |
$ios_lib_version | iOS Lib Version | Last used Mixpanel iOS / Unity library version. |
$ios_version | iOS Version | Last version of iOS on the device. |
$last_seen | Updated at | The last time a user profile property was set or updated. |
$lib_version | Library Version | Last used Mixpanel library version (also applies to React Native and Flutter wrapper SDKs). |
$os | Operating System | Last OS of the event sender. |
$region | Region | The region (state or province) of the user parsed from the IP or Latitude and Longitude properties. |
$swift_lib_version | Swift Lib Version | Last used Mixpanel Swift library version. |
$timezone | Timezone | Timezone of the user parsed from the IP or Latitude and Longitude properties. |
initial_utm_source, initial_utm_medium, etc. | Initial UTM Parameters | UTM tags seen for the first time from the URL a customer clicked to arrive at your domain. Each UTM is a separate property. |
Querying the data
In the new schema, the data in the Events table can be queried and the corresponding JSON fields can be flattened as follows.
SELECT
DeviceId,
DistinctId,
EventName,
InsertId,
Time,
UserId,
InstitutionId,
InstitutionName,
Properties,
-- Nested JSON fields with $ prefix will need to be extracted using regex matching
REGEXP_EXTRACT(TO_JSON_STRING(Properties), r'"\$city":"([^"]+)"') AS city,
REGEXP_EXTRACT(TO_JSON_STRING(Properties), r'"\$region":"([^"]+)"') AS region,
-- Nested JSON fields with no $ prefix will need to be extracted using JSON extraction
JSON_EXTRACT_SCALAR(Properties, '$.languageCode') AS language_code,
JSON_EXTRACT_SCALAR(Properties, '$.mp_country_code') AS mp_country_code
FROM
<FI_Project>.<FI_Dataset>.events
LIMIT 10;