Developer Programs

Learn

Docs

Banno Usage Data

Pipelined Data Sets > Banno Usage Data

Introduction

The goal of this pipeline is to make the Banno Usage Data dataset (previously called Mixpanel dataset) available in JH-managed Google Big Query where it can be queried using the SQL interface for business purposes and to drive downstream business processes.

The usage pipeline is being updated to use the new schema. For the main table called “Events”, Unlike the old schema which is flat, the new schema consists of nested JSON column called “Properties”. The “Properties” column has various optional fields which may or may not be populated depending on the client telemetry coming in from Mixpanel. Detailed description of various schema properties collected is provided below.

Scenarios

The Banno Usage Data dataset consists of two tables, namely, events and people. Both tables hold multi-tenant data separated by a single key which is a GUID that uniquely identifies a specific Financial Institution. In addition, the name of the Financial Institution is also included in each row.

Pipelines

The Banno Usage pipeline consists of two different pipelines.

  • The first pipeline is owned and managed by Mixpanel and writes raw datasets to Google Cloud Storage storage buckets in Data Hub Google project. This pipeline creates and manages the initial and subsequent incremental loading of data into the two tables in batch mode, typically once an hour. JH provides the empty dataset and permissions to it for the DAA service account. DAA exposes a REST API that is used to create and manage the pipeline. The REST API is authenticated with a project key provided to JH leadership team. Details of the pipeline design and architecture, BigQuery objects, and service account details are provided here.
  • The second pipeline, built, owned and managed by Jack Henry, parses the raw data files from Google Cloud Storage (written by the previous pipeline) in realtime, flattens certain nested fields, and loads the data into Big Query tables in the Data Hub Google project.

Development

The REST API to create the Mixpanel BigQuery realtime export pipeline is documented in detail here.

The second pipeline is custom-built using .NET Cloud Functions, and it is available in the repo here.

Schema

Raw NameDisplay NameDescription
$app_build_numberApp Build NumberGeneral build of the app.
$app_version_stringApp VersionCurrent app version.
$bluetooth_enabledBluetoothSet to true if Bluetooth is enabled, false if not.
$bluetooth_versionBluetooth VersionSet to “none”, “ble”, or “classic”.
$brandBrandDevice brand.
$browserBrowserName of the browser.
$browser_versionBrowser VersionVersion of the browser.
$carrierCarrierWireless carrier of the device owner.
$cityCityThe city of the event sender parsed from the IP property or the Latitude and Longitude properties.
$current_urlCurrent URLThe URL of the page on which the event was tracked.
$deviceDeviceName of the event sender’s device, if they’re on mobile web.
$device_idDevice IDAutogenerated ID that is local to the device. Calling reset() regenerates this value.
$durationDurationA property that is appended to an event to capture the elapsed time (in seconds) between a time_event and a track call.
$google_play_servicesGoogle Play ServicesVerifies that Google Play services is installed and enabled on this device, and that the version installed on this device is no older than the one required by this client.
$has_nfcHas NFCThe device supports Near Field Communication (NFC).
$has_telephoneHas TelephoneSet to true if this device has telephone functionality, false if not.
$importImportInternal Mixpanel property set to true to indicate that events were sent through /import API.
$initial_referrerInitial ReferrerReferring URL when the user first arrived on your site. Defaults to “$direct” if the user is not referred.
$initial_referring_domainInitial Referring DomainReferring domain at first arrival. Defaults to “$direct” if the user is not referred.
$insert_idInsert IDA unique identifier for the event, used to deduplicate events that are accidentally sent multiple times.
$lib_versionLibrary VersionMixpanel library version.
$manufacturerManufacturerDevice manufacturer.
$modelModelThe model of the device.
$mp_api_endpointAPI EndpointMixpanel property to record the API endpoint the data was sent to:api.mixpanel.com - default ingestionapi-eu.mixpanel.com - EU data ingestionapi-in.mixpanel.com - India data ingestionapi-js.mixpanel.com - Javascript SDK
$mp_api_timestamp_msAPI TimestampUTC timestamp in milliseconds when the event was received by our API.
$osOperating SystemOS of the event sender.
$os_versionOS VersionThe current version of operating system on the device.
$radioRadioCurrent cellular network communication standard (3G, 4G, LTE, etc).
$referrerReferrerReferring URL including your own domain. Might not be present if document.referrer does not return a value.
$referring_domainReferring DomainReferring domain including your own domain. Might not be present if document.referrer does not return a value.
$regionRegionThe region (state or province) of the event sender parsed from the IP property or the Latitude and Longitude properties.
$screen_dpiScreen DPIPixel density of the device screen.
$screen_heightScreen HeightThe height of the device screen in pixels / points (iOS).
$screen_widthScreen WidthThe width of the device screen in pixels / points (iOS).
$search_engineSearch EngineThe search engine that the customer used when they arrived at your domain.
$user_idUser IDThe identified ID of the user. Calling identify() sets this.
$watch_modelWatch ModelThe model of the iOS watch.
$wifiWifiSet to true if the user’s device has an active, available Wifi connection, false if not.
mp_country_codeCountryThe country of the event sender parsed from the IP property or the Latitude and Longitude properties. The value is stored as a 2-letter country code in the raw data and parsed into the country name in the UI.
mp_keywordSearch KeywordSearch keywords detected on the referrer from a search engine to your domain.
mp_libMixpanel LibraryThe Mixpanel library that sent the event.
mp_processing_time_msTime Processed (UTC)UTC timestamp in milliseconds when the event was processed by Mixpanel servers.
mp_sent_by_lib_versionSent By Library VersionMixpanel library version used to send data (not necessarily the same as the version which enqueued the data).
utm_source, utm_medium, etc.UTM ParametersUTM tags derived from the URL a customer clicked to arrive at your domain. Each UTM will be collected under its own property.

Querying the data

In the new schema, the data in the Events table can be queried and the corresponding JSON fields can be flattened as follows.

  SELECT
    DeviceId,
    DistinctId,
    EventName,
    InsertId,
    Time,
    UserId,
    InstitutionId,
    InstitutionName,
    Properties,
    -- Nested JSON fields with $ prefix will need to be extracted using regex matching
    REGEXP_EXTRACT(TO_JSON_STRING(Properties), r'"\$city":"([^"]+)"') AS city,
    REGEXP_EXTRACT(TO_JSON_STRING(Properties), r'"\$region":"([^"]+)"') AS region,
    -- Nested JSON fields with no $ prefix will need to be extracted using JSON extraction
    JSON_EXTRACT_SCALAR(Properties, '$.languageCode') AS language_code,
    JSON_EXTRACT_SCALAR(Properties, '$.mp_country_code') AS mp_country_code
FROM
    sdb-dig-databroker-demoa8b4.garden_bank_banno_usage.events
    LIMIT 10;

Topics in this section


Have a Question?
Have a how-to question? Seeing a weird error? Get help on StackOverflow.
Register for the Digital Toolkit Meetup where we answer technical Q&A from the audience.
Last updated Mon Mar 10 2025