
Downloading Raw Data Files

Requirements


  • You must have an enterprise account
  • You must have credentials to log in to this account
  • You must know the API key of the app you want to fetch data for

Overview


The application provides raw data from the last 60 days in one or more files per hour. The list of files available for an app for a given hour can be fetched from a URL of the following form:

http://analytics-api.upsight.com/data/raw_data/<API_KEY>/<YEAR>/<MONTH>/<DAY>/<HOUR>/
  • Replace <API_KEY> with the 32-character API key for the application (available on the Upsight Analytics Dashboard under "Manage Apps").
  • Replace <YEAR> with the 4-digit year.
  • Replace <MONTH> with the 2-digit month.
  • Replace <DAY> with the 2-digit day of the month.
  • Replace <HOUR> with the 2-digit hour in 24-hour format (relative to UTC).

Example:

http://analytics-api.upsight.com/data/raw_data/0123456789abcdef0123456789abcdef/2010/01/01/00/
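The substitutions above can be sketched in Python. This is a minimal illustration, not part of the Upsight tooling: the function name raw_data_url is made up for this example, and it assumes that a datetime without time zone information is already expressed in UTC.

```python
from datetime import datetime, timezone

BASE_URL = "http://analytics-api.upsight.com/data/raw_data"


def raw_data_url(api_key, hour):
    """Build the listing URL for one UTC hour (hypothetical helper)."""
    if len(api_key) != 32:
        raise ValueError("the API key should be 32 characters long")
    # The hour in the URL is relative to UTC, so convert aware datetimes.
    # Naive datetimes are assumed to be in UTC already.
    if hour.tzinfo is not None:
        hour = hour.astimezone(timezone.utc)
    # %Y = 4-digit year, %m = 2-digit month, %d = 2-digit day, %H = 2-digit hour
    return f"{BASE_URL}/{api_key}/{hour:%Y/%m/%d/%H}/"


example = raw_data_url(
    "0123456789abcdef0123456789abcdef",
    datetime(2010, 1, 1, 0, tzinfo=timezone.utc),
)
print(example)
```

For the key and hour used here, the printed URL matches the example URL shown above.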

There are three formats available:

  • html -- Displays the data in a format appropriate for a browser, with links to each available file. This format is appropriate when using a tool (such as wget) in recursive mode to get all files for an hour.
  • text -- URLs of the files for the hour, separated by newlines (\n)
  • json -- JSON-encoded list of URLs of the files for the hour

The format can be set with the format parameter as shown in the example below. The format defaults to html if an explicit format is not specified.

http://analytics-api.upsight.com/data/raw_data/0123456789abcdef0123456789abcdef/2010/01/01/00/?format=text
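A script that requests the text or json format needs to turn the response body into a list of URLs. The following is a rough sketch of that step only. The function name parse_file_list is hypothetical, and the file URLs in the usage lines are invented placeholders, since the real file names are not shown here.

```python
import json


def parse_file_list(body, fmt):
    """Return the file URLs found in a listing response body (hypothetical helper)."""
    if fmt == "text":
        # One URL per line; splitlines() also tolerates \r\n line endings.
        return [line.strip() for line in body.splitlines() if line.strip()]
    if fmt == "json":
        urls = json.loads(body)
        if not isinstance(urls, list):
            raise ValueError("expected a JSON-encoded list of URLs")
        return urls
    raise ValueError(f"unsupported format: {fmt!r}")


# Placeholder bodies; real responses would contain the actual file URLs.
text_body = "http://example.invalid/a.gz\nhttp://example.invalid/b.gz\n"
json_body = '["http://example.invalid/a.gz", "http://example.invalid/b.gz"]'
print(parse_file_list(text_body, "text") == parse_file_list(json_body, "json"))
```

The html format is left out on purpose: it is meant for browsers and recursive download tools rather than for direct parsing.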

Downloading Files Using a Browser


  1. Enter the appropriate URL in the address bar, specifying the API key and year/month/day/hour
  2. If prompted to, enter login credentials
  3. Click on the links to download the available files

Downloading Files Using wget


Specify the -r, --user, and --password options when using wget to download all the files for an hour. For example:

wget -r --user=USERNAME --password=PASSWORD \
    http://analytics-api.upsight.com/data/raw_data/0123456789abcdef0123456789abcdef/2010/01/01/00/

Using the -r option instructs wget to automatically download all of the files available for that hour.
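Where wget is not available, a similar download can be approximated with the Python standard library. This sketch rests on an assumption: the authentication scheme is not stated here, so it registers handlers for both HTTP Basic and HTTP Digest authentication. The helper names make_opener and download are made up for illustration.

```python
import shutil
import urllib.request


def make_opener(url_prefix, username, password):
    """Build an opener that answers Basic or Digest challenges (assumed schemes)."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # Realm None means "use these credentials for any realm under url_prefix".
    manager.add_password(None, url_prefix, username, password)
    return urllib.request.build_opener(
        urllib.request.HTTPBasicAuthHandler(manager),
        urllib.request.HTTPDigestAuthHandler(manager),
    )


def download(opener, file_url, dest_path):
    """Stream one file to disk without loading it fully into memory."""
    with opener.open(file_url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)


opener = make_opener(
    "http://analytics-api.upsight.com/data/raw_data/", "USERNAME", "PASSWORD"
)
```

Unlike wget -r, this does not follow links by itself; the caller would pass each file URL from the listing to download() in turn.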

File Format


Compression

A data file is compressed using gzip.
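A downloaded file can be read without unpacking it to disk first. The sketch below uses Python's gzip module. The function name read_lines is hypothetical, and UTF-8 decoding is an assumption, since the character encoding of the files is not stated here.

```python
import gzip


def read_lines(path):
    """Yield the decoded lines of a gzip-compressed data file (hypothetical helper)."""
    # Text mode ("rt") decompresses and decodes on the fly.
    # UTF-8 is assumed; undecodable bytes are replaced rather than raising.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")
```

Because the function is a generator, large files are processed one line at a time.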

Sections

A data file contains two sections:

  • A two-line header
  • A body containing one or more messages

There is no separator between the two sections; the body begins immediately after the second header line.

Header Format

Each header line starts with a # character. The first line identifies the version of the data file format using the following format:

# KTRAW <VERSION>

The current <VERSION> is 0.1. For this format, the header has a fixed length of 2 lines. The second line lists:

  • the API key the data is associated with,

  • the Upsight data capture API format used, and

  • the UTC hour that the file contains data for.

Note: The UTC hour is specified as the number of seconds since the start of the Unix epoch (1/1/1970 00:00:00 UTC).

A sample second header line for the application with the API key 0123456789abcdef0123456789abcdef, for the hour starting 1/1/2010 00:00 UTC:

# 0123456789abcdef0123456789abcdef v1 1262304000
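The two header lines can be parsed as follows. This is a sketch under the assumption that header fields are separated by whitespace, as in the sample lines. The names RawHeader and parse_header are made up for this example.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class RawHeader(NamedTuple):
    version: str      # data file format version, e.g. "0.1"
    api_key: str      # API key the data is associated with
    api_format: str   # data capture API format, e.g. "v1"
    hour: datetime    # start of the UTC hour covered by the file


def parse_header(line1, line2):
    """Parse the two '#'-prefixed header lines (hypothetical helper)."""
    first = line1.split()
    if len(first) != 3 or first[0] != "#" or first[1] != "KTRAW":
        raise ValueError(f"unexpected first header line: {line1!r}")
    second = line2.split()
    if len(second) != 4 or second[0] != "#":
        raise ValueError(f"unexpected second header line: {line2!r}")
    _, api_key, api_format, epoch_seconds = second
    # The hour is given as seconds since 1970-01-01 00:00:00 UTC.
    hour = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return RawHeader(first[2], api_key, api_format, hour)


header = parse_header(
    "# KTRAW 0.1",
    "# 0123456789abcdef0123456789abcdef v1 1262304000",
)
print(header.hour.isoformat())  # 2010-01-01T00:00:00+00:00
```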

Message Format

Each line after the header contains a message received by Upsight's data capture servers for the application. The line is space-delimited and contains the following fields:

  • time stamp -- time relative to start of hour, in seconds
  • message type -- 3-character message name (see Upsight Data Collection API documentation for valid message types)
  • message parameters -- ampersand-delimited list of parameters for the message
  • source address -- IP address that the message was received from
  • referring URL

A sample message line is shown below:

144 apa s=0&ts=20100101010224&an_sig=0123456789abcdef0123456789abcdef 1.2.3.4 "-"
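A message line such as the sample above can be split into its five fields as sketched below. Several points are assumptions rather than documented behavior: that the parameter list contains no spaces, that parameter values can be kept as raw strings without URL decoding, and that the referring URL is wrapped in double quotes as in the sample. The names RawMessage and parse_message are hypothetical.

```python
from typing import NamedTuple


class RawMessage(NamedTuple):
    offset_seconds: int  # seconds since the start of the file's hour
    message_type: str    # 3-character message name
    params: dict         # message parameters, values kept as raw strings
    source_ip: str       # address the message was received from
    referrer: str        # referring URL with surrounding quotes removed


def parse_message(line):
    """Parse one space-delimited message line (hypothetical helper)."""
    # Split at most 4 times so a referrer containing spaces stays in one piece.
    fields = line.rstrip("\n").split(" ", 4)
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    offset, message_type, raw_params, source_ip, referrer = fields
    # "k=v&k2=v2" -> {"k": "v", "k2": "v2"}; no percent-decoding is applied.
    params = dict(p.partition("=")[::2] for p in raw_params.split("&") if p)
    return RawMessage(int(offset), message_type, params, source_ip,
                      referrer.strip('"'))


message = parse_message(
    '144 apa s=0&ts=20100101010224'
    '&an_sig=0123456789abcdef0123456789abcdef 1.2.3.4 "-"'
)
print(message.message_type, message.offset_seconds, message.params["s"])
```

Adding offset_seconds to the hour given in the file header yields the absolute time at which the message was received.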