Data Analyst – Freelance & Remote Project

  • Freelancing
  • Remote
  • 2 November 2020
  • 1 position

Unfortunately, this offer is no longer available.

Presentation of the Company

Our client is a video game development company based in Paris and Montreal. Its game is an MMO space-opera first-person RPG in which players can harvest materials by digging voxels, create buildings and ships, trade materials and other items with any other player, explore planets and satellites, join communities and engage in spaceship combat. Every player action takes place in a persistent open world hosted on a single-shard shared server. In alpha testing since 2019, the game has been in open beta since August 2020.

Mission description

Current architecture : 

Our client generates high volumes of data for two reasons: players play in a large persistent open world made up of multiple planets, and the emergent gameplay allows a broad variety of playstyles. Moreover, developing the game involves numerous technical challenges in rendering, balancing and design, which require precise analysis of data that must be complete, up to date and of high quality. As of now, data are generated by 3 main sources:

  • Client data ;
  • Server data ;
  • Web data.

1. Client data are generated and sent by the game client running on players’ PCs. Here are the steps of the data pipeline between the PC game clients and our client’s data warehouse:

  • The game client generates JSON files containing the attributes of tracked gameplay events. Only a fraction of the events happening in the game engine are tracked ;
  • JSON files are sent in batches to an AWS S3 bucket every 10 minutes to avoid overly frequent upload requests ;
  • JSON files are cleaned and concatenated by a Python script ;
  • JSON files are loaded into the Snowflake data warehouse using a Snowpipe process.
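
For illustration only, the cleaning and concatenation step above could look roughly like the sketch below. It is not our client's actual script: the function name `clean_and_concat` and the required attributes (`event_type`, `player_id`, `timestamp`) are assumptions, since the real event schema is not described here. The sketch merges raw JSON payloads into one newline-delimited JSON string, a layout that is commonly used for bulk loading.

```python
import json

# Hypothetical required attributes; the real event schema is not given in this posting.
REQUIRED_FIELDS = ("event_type", "player_id", "timestamp")


def clean_and_concat(raw_files):
    """Merge raw JSON payloads into one newline-delimited JSON string.

    Each payload may hold a single event object or a list of events.
    Unparseable payloads and events missing a required field are dropped.
    """
    lines = []
    for raw in raw_files:
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip a corrupt file rather than fail the whole batch
        events = payload if isinstance(payload, list) else [payload]
        for event in events:
            if isinstance(event, dict) and all(f in event for f in REQUIRED_FIELDS):
                # One event per line, with keys sorted for stable output
                lines.append(json.dumps(event, sort_keys=True))
    return "\n".join(lines)
```

Dropping incomplete events silently is a simplification; a production script would more likely log or quarantine them so that data quality can be monitored.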

2. Server data is stored in multiple locations and comes from multiple sources.

Server data includes:

  • Gameplay events describing states of open-world components (transactions, voxels, character skills) on a PostgreSQL server ;
  • Time series of server health metrics (load balancing, queue time, concurrent users) stored in an InfluxDB instance and visualized using Grafana ;
  • Server logs stored in an Elasticsearch instance and visualized using Kibana.

3. Web data includes:

  • Anonymized personal information of gamer profiles from the game website ;
  • Payment data from the Xsolla platform ;
  • Marketing data from social networks and other platforms ;
  • Web data is stored on the same PostgreSQL server as server data.

Below is an outline of the current architecture.

Current issues :

The current use of data in analytics has some flaws that create efficiency issues:

  • No ETL tool is plugged into Snowflake. There is no reliable way to create, edit or schedule processes that transform data within the Snowflake data warehouse ;
  • The Snowflake data warehouse only features a fact table containing all client events ;
  • There is no dimension table or referential table yet ;
  • As a result, all analytics dashboards are based on client events, and were created while the game was in alpha ;
  • Updating all analytics dashboards for the beta phase would require time to fix every broken visualization.

Main goals of the mission

The study will tackle these issues:

  • Choose and deploy into production an ETL tool suited to the Snowflake data warehouse, allowing the creation and maintenance of tables updated hourly or daily ;
  • Ensure the designed solution performs well in terms of both speed and cost ;
  • Build a future-proof data platform by designing and implementing specification sheets for dimension tables that would be populated by events from the client events fact table. The platform should be adapted to future release phases of the company’s game, not just the current open beta phase ;
  • Update the existing analytics to fit the new data platform, fix broken views and help the Data team build new analytics dashboards.

Profile

  • Data Analyst with 1 to 5 years of experience
  • Technical stack: Snowflake, PostgreSQL, Python, Power BI, Tableau, Qlik Sense/QlikView
  • Remote position
  • Proficient in English
