
Taming the Data Mess: A Guide to Naming Conventions for Data Collection


In data collection, inconsistent and confusing data labels can be a major pain point, hindering data analysis and causing unnecessary delays. This data mess often arises from developers using different terms to describe the same user interaction, leading to a tangled web of data that’s difficult to decipher.

Like technical debt, messy data can accumulate over time, becoming increasingly difficult to manage. To address this issue, it’s crucial to establish a naming convention framework for your product analytics strategy, ensuring consistency and clarity in data collection.


What is a Naming Convention?

A naming convention is a set of rules for labeling data. It serves as a guide for identifying and categorizing events within a product or service. Typically, this framework is documented in a PDF, spreadsheet, or section of your technical documentation website.

A comprehensive naming convention should outline the guidelines for filling in the event schema and creating data labels. It should include plenty of examples and address edge cases or unusual scenarios to facilitate easy understanding. By adhering to these rules, your team can create consistent and clear data labels for new events, streamlining data analysis and decision-making.

Establishing a Naming Convention

If you’re unsure what data to track, or your data is already in disarray, the first step is to develop a tracking plan. This plan will help you identify patterns and rules that can form the basis of your naming conventions. Just as journaling brings clarity to thoughts, creating a tracking plan clarifies which events to track and which naming conventions suit your product.

Creating a Tracking Plan:

  1. Setup: Choose a workspace like a spreadsheet, Miro board, paper, or whiteboard.
  2. User Journey: Describe a typical user journey in your product, such as signing up or purchasing.
  3. User Steps: Break down the user journey into individual steps, which will become event data points.
  4. Event Analysis: For each step/event, answer the following questions:
    • Event Context: What happened in this step?
    • Actor Identification: Who triggered this action? (User or system?)
    • Details Documentation: What was created or updated? What other details about this step are important?
    • Event Schema: If using an event schema, translate these questions into schema fields.
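
The questions above can be captured in a small data structure before any tracking code is written. The following is a minimal sketch, not a prescribed schema: the class name `TrackingPlanEntry`, its field names, and the sign-up steps are all hypothetical and chosen only to mirror the questions in the list.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class TrackingPlanEntry:
    """One step of a user journey, recorded as a candidate event (hypothetical shape)."""
    journey: str   # the user journey this step belongs to
    step: str      # event context: what happened in this step
    actor: str     # actor identification: "user" or "system"
    details: dict = field(default_factory=dict)  # what was created or updated

    def __post_init__(self):
        # Reject anything other than the two actors named in the list above.
        if self.actor not in ("user", "system"):
            raise ValueError(f"actor must be 'user' or 'system', got {self.actor!r}")


# An illustrative sign-up journey broken into individual steps.
signup_plan = [
    TrackingPlanEntry("sign up", "user opened the sign-up page", "user"),
    TrackingPlanEntry("sign up", "user submitted the sign-up form", "user",
                      {"created": "account"}),
    TrackingPlanEntry("sign up", "system sent a welcome email", "system",
                      {"created": "email"}),
]

# Plain dictionaries are easy to paste into a spreadsheet or export as JSON.
rows = [asdict(entry) for entry in signup_plan]
for row in rows:
    print(row["actor"], "|", row["step"])
```

Keeping the plan as plain records makes it easy to spot repeated objects and actions, which is where naming rules tend to emerge.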

Object+Action Framework

The object+action framework is a recommended approach for naming events. It simplifies the process of identifying the object a user interacts with and the action taken. This framework is widely adopted by product analytics tools, including Mini Digital, as a fundamental convention for event names.

  1. Object: Identify the object (noun) the user or system interacts with. Examples include buttons, screens, pages, data entities like accounts, wallets, blogs, images, or products.
  2. Action: Choose an action verb (past tense) to describe the user/system’s interaction with the object. Examples include created, clicked, selected, viewed, or connected.
  3. Event Name: Combine the object and action to form the event name, such as walletCreated or walletConnected. This defines the eventName property.
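
As a rough illustration of the three steps above, the sketch below joins an object and a past-tense action into one event name. The helper names `to_words` and `event_name` are hypothetical, and camelCase is assumed only because the examples above (walletCreated, walletConnected) use it; swap in whichever case your codebase already follows.

```python
import re


def to_words(text: str) -> list[str]:
    """Split free text such as 'sign-up form' into lowercase words."""
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", text)]


def event_name(obj: str, action: str) -> str:
    """Combine an object (noun) and a past-tense action into a camelCase name."""
    obj_words = to_words(obj)
    action_words = to_words(action)
    if not obj_words or not action_words:
        raise ValueError("both an object and an action are required")
    first, *rest = obj_words + action_words
    # The first word stays lowercase; every later word is capitalized.
    return first + "".join(word.capitalize() for word in rest)


print(event_name("wallet", "created"))          # walletCreated
print(event_name("wallet", "connected"))        # walletConnected
print(event_name("sign-up form", "submitted"))  # signUpFormSubmitted
```

Generating names from the two parts, rather than typing them by hand, keeps the object first and the action last in every event.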

Avoiding Common Pitfalls

  1. Case Sensitivity: Use a consistent case format for data labels. Consider adopting the case already used in your codebase for consistency.
  2. Acronyms: Avoid abbreviations unless they are widely recognized, like NFT (non-fungible token). Otherwise, spell out the full words or move the detail into an additional event property.
  3. Personally Identifiable Information: Avoid embedding project, team, or personal names, or any other personally identifiable information, in data labels. Use event properties for such details.
  4. Numbers, Codes, Dates, and IDs: Keep numbers, codes, dates, and IDs out of event names. Store them in event properties instead, each under a descriptive label that gives the value context.
  5. Multiple Languages: Choose a single language for data collection to avoid inconsistencies. Establish clear rules for language usage in naming conventions.
  6. State of Codebase or Project: Avoid including codebase, product, or project state information in event names. Use event properties for this purpose.
  7. Prefixing and Suffixing: Use prefixes and suffixes to indicate state or timing, but avoid subjective words like “new”, “old”, or “final”. Instead, consider the following alternatives:
     Original Word   Alternative Words
     new             initiated, created, added, started
     old             existing, previously_seen, visited_before
     first           first_time, initial, unseen
     last            completed, ended
     final           completed, resolved, closed
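
Several of these pitfalls can be caught automatically before an event ships. The following is a minimal sketch of such a check, assuming camelCase event names. The function `lint_event_name`, its messages, and the sample names are hypothetical, and the check covers only case format, digits, and the subjective words listed above.

```python
import re

# Words flagged as subjective in the list above.
SUBJECTIVE_WORDS = {"new", "old", "first", "last", "final"}

# Assumed convention: camelCase made of letters only, e.g. walletCreated.
CAMEL_CASE = re.compile(r"^[a-z]+(?:[A-Z][a-z]+)*$")


def split_camel(name: str) -> list[str]:
    """Split 'walletCreated' into ['wallet', 'created']."""
    return [word.lower() for word in re.findall(r"[A-Za-z][a-z]*", name)]


def lint_event_name(name: str) -> list[str]:
    """Return a list of problems found in an event name (empty if none)."""
    problems = []
    if re.search(r"\d", name):
        problems.append("contains digits; move numbers, dates, and IDs to event properties")
    if not CAMEL_CASE.match(name):
        problems.append("not camelCase")
    # Compare whole words, so 'subscriptionRenewed' is not flagged for 'new'.
    for word in sorted(SUBJECTIVE_WORDS.intersection(split_camel(name))):
        problems.append(f"contains subjective word '{word}'")
    return problems


for candidate in ["walletCreated", "newWalletCreated", "wallet_created_2024"]:
    print(candidate, "->", lint_event_name(candidate) or "ok")
```

A check like this could run in code review or CI, so that naming drift is caught when an event is added rather than during analysis.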

Conclusion

Establishing a naming convention framework for data collection is essential for maintaining consistency, clarity, and efficiency in data analysis. By following these guidelines, you can ensure your data is organized, easily understandable, and ready to be transformed into actionable insights.