Data Schema

Data schemas are important for training an agent. A data schema defines which columns of a dataset should be predicted, which are independent variables, and which are switches that should be varied to minimize the cost function.

The data schema model

The data schema model contains all the information about the data schema: it groups the columns of a dataset by their role.

Properties

  • Name
    SwitchColumns
    Type
    json
    Description

    The columns whose switch states should be optimized during inference.

  • Name
    IndependentColumns
    Type
    json
    Description

    The columns that might influence the dependent variables.

  • Name
    DependentColumns
    Type
    json
    Description

    The columns which should be predicted.

  • Name
    HelperColumns
    Type
    json
    Description

    The columns which should not be used during training, but might be used during inference.

Example Data Schema

{
  "SwitchColumns":       [
                              {"name": "hasAgofCampaign", "type":"binary", "options": {}, "switch_states": [0, 1]},
                              {"name": "hasIPTargetingCampaign", "type":"binary", "options": {}, "switch_states": [0, 1]},
                              {"name": "fcCappingCampaign", "type":"numerical", "options": {"log_transform": False, "min_value": [1], "max_value": [100], "epsilon": None}, "switch_states": [1, 5, 10, 20, 50]},
                              {"name": "fcCappingTime.1", "type":"numerical", "options": {"log_transform": True, "min_value": [10], "max_value": [14], "epsilon": [1]}, "switch_states": [1, 1*60, 1*60*30, 1*60*60, 4*60*60, 8*60*60, 24*60*60, 2*24*60*60,  8*24*60*60,  14*24*60*60]},
                              {"name": "maxCpm", "type": "numerical", "options": {"log_transform": False, "min_value": [0], "max_value": [1200], "epsilon": None}, "switch_states": [175, 1190,  350,  275,  276,  770,  330,  300, 1200,  340,  400, 375,  208,  250,  210, 500]},
                              {"name": "biddingStrategy", "type": "categorial", "options": {"categories": ["INTERVAL_BUDGET_OPTIMIZED", "MAXPRICE", "FLOORPRICE"]}, "switch_states": ["INTERVAL_BUDGET_OPTIMIZED", "MAXPRICE"]},
                              {"name": "deliveryTechnique", "type": "categorial", "options": {"categories": ["SMOOTH", "SIMPLE"]}, "switch_states": ["SMOOTH", "SIMPLE"]},
                              {"name": "hashAgofListCampaign", "type": "categorial", "options": {"categories": ["78dba1dda34d66a63363f3071c3bc542ec563f49ba72d1055f7f631f7267c8f0", "e448808417a4f48e3db3d77a8a316fd9f4c186976756f928cf9f92bc9e89cb3b", "2f363fddccdfbea43dc2760033caa4c1890a1f6cd8c7914abff51ec0bca5be44", "bd0b550608481e7e2cf7bb790589bff15543e5728bc067cb378968e407962ab2", "bab53f6a9302a236646f22c09584d543be48b828537251829e8dbbd14875d590", "1b1f52b15a73dcc3dd91f9512d623a8d740d53a994813dc74c4ce1b8157c90c6"]}, "switch_states": ["78dba1dda34d66a63363f3071c3bc542ec563f49ba72d1055f7f631f7267c8f0", "1b1f52b15a73dcc3dd91f9512d623a8d740d53a994813dc74c4ce1b8157c90c6"]},
                          ],
  "IndependentColumns":  [
                              {"name": "size", "type": "categorial", "options": {"categories": ['90x728', '600x160', '250x300', '250x970', '50x300', '75x300','50x20']}},
                              {"name": "hour", "type": "categorial", "options": {"categories": [ 5,  8, 11, 14, 17, 20, 23,  2]}},
                              {"name": "dayofweek", "type": "categorial",  "options": {"categories": [6, 0, 1, 2, 3, 4, 5]}},
                              {"name": "month", "type": "categorial", "options": {"categories": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}},
                              {"name": "totalBudget", "type": "numerical", "options": {"log_transform": True, "min_value": [6], "max_value": [14], "epsilon": [1]}},
                              {"name": "totalImpressions", "type": "numerical", "options": {"log_transform": True, "min_value": [7], "max_value": [14], "epsilon": [1]}},
                          ],
  "DependentColumns":     [
                              {"name": "Clicks", "type": "numerical", "options": {"log_transform": True, "min_value": [0], "max_value": [5], "epsilon": [1]}},
                              {"name": "Impressions", "type": "numerical","options": {"log_transform": True, "min_value": [0], "max_value": [12], "epsilon": [1]}},
                              {"name": "Costs", "type": "numerical", "options": {"log_transform": True, "min_value": [-12], "max_value": [3], "epsilon": [0.00001]}},
                          ],
  "HelperColumns":        [
                              {"name": "targetImpressions", "type": "numerical", "options": {"scale": False, "log_transform": False, "min_value": [0], "max_value": [12], "epsilon": [1]}},
                          ]
}
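
A schema of this shape can be checked for structural mistakes before it is used for training. The following is a minimal sketch, not part of the documented API: the function name `check_schema` is hypothetical, and it assumes that all four column groups must be present and that only switch columns need `switch_states`.

```python
REQUIRED_GROUPS = ("SwitchColumns", "IndependentColumns", "DependentColumns", "HelperColumns")
REQUIRED_COLUMN_KEYS = ("name", "type", "options")


def check_schema(schema):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for group in REQUIRED_GROUPS:
        if group not in schema:
            problems.append(f"missing group: {group}")
            continue
        for index, column in enumerate(schema[group]):
            label = f"{group}[{index}]"
            for key in REQUIRED_COLUMN_KEYS:
                if key not in column:
                    problems.append(f"{label}: missing '{key}'")
            # Assumption: only switch columns must list their switch states.
            if group == "SwitchColumns" and not column.get("switch_states"):
                problems.append(f"{label}: missing 'switch_states'")
    return problems


# A shortened version of the example schema above.
schema = {
    "SwitchColumns": [
        {"name": "hasAgofCampaign", "type": "binary", "options": {}, "switch_states": [0, 1]},
    ],
    "IndependentColumns": [
        {"name": "hour", "type": "categorial", "options": {"categories": [5, 8, 11, 14, 17, 20, 23, 2]}},
    ],
    "DependentColumns": [
        {"name": "Clicks", "type": "numerical",
         "options": {"log_transform": True, "min_value": [0], "max_value": [5], "epsilon": [1]}},
    ],
    "HelperColumns": [],
}

print(check_schema(schema))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to fix a long schema in a single pass.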

The column model

The column class represents a column in a dataset. It holds the column's data type and normalization settings, as well as its switch states.

Properties

  • Name
    name
    Type
    string
    Description

    The name of the column.

  • Name
    type
    Type
    string
    Description

    The data type of the column: binary, categorical or numerical. The examples on this page spell the categorical type as "categorial".

  • Name
    options
    Type
    json
    Description

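    Normalization settings for the column. In the example above, numerical columns use log_transform, min_value, max_value and epsilon, categorical columns list their categories, and binary columns leave the options empty.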
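
This page does not spell out how the numerical options are applied. One plausible reading, consistent with the example values (for instance min_value -12 for Costs with epsilon 0.00001, where log(0.00001) is about -11.5), is that epsilon is added before taking the logarithm and that min_value and max_value bound the transformed value. The sketch below implements that reading only; the function names `normalize` and `denormalize` are hypothetical and the actual implementation may differ.

```python
import math


def _epsilon(options):
    # epsilon is either None or a one-element list in the example schema.
    eps = options.get("epsilon")
    return eps[0] if eps else 0.0


def normalize(value, options):
    """Map a raw value to the [0, 1] range under the assumed reading of `options`."""
    if options.get("log_transform"):
        value = math.log(value + _epsilon(options))
    low = options["min_value"][0]
    high = options["max_value"][0]
    return (value - low) / (high - low)


def denormalize(scaled, options):
    """Invert `normalize`, turning a scaled value back into a raw one."""
    low = options["min_value"][0]
    high = options["max_value"][0]
    value = scaled * (high - low) + low
    if options.get("log_transform"):
        value = math.exp(value) - _epsilon(options)
    return value


clicks_options = {"log_transform": True, "min_value": [0], "max_value": [5], "epsilon": [1]}
max_cpm_options = {"log_transform": False, "min_value": [0], "max_value": [1200], "epsilon": None}

print(normalize(0, clicks_options))     # 0.0, because log(0 + 1) = 0
print(normalize(300, max_cpm_options))  # 0.25, because 300 / 1200 = 0.25
```

Under this reading, epsilon keeps the logarithm defined for zero-valued entries such as a row with no clicks.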

Optional Properties

  • Name
    switch_states
    Type
    array
    Description

    The states that the column is allowed to take when it is varied during inference.
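
To illustrate how switch states define the search space during inference, the sketch below enumerates every combination of allowed states and picks the cheapest one. It is an assumption that the search is an exhaustive grid; the agent may search differently. `switch_combinations` and `predicted_cost` are hypothetical names, and `predicted_cost` is a toy stand-in for the trained agent's prediction.

```python
from itertools import product

# Three switch columns taken from the example schema.
switch_columns = [
    {"name": "hasAgofCampaign", "type": "binary", "options": {}, "switch_states": [0, 1]},
    {"name": "deliveryTechnique", "type": "categorial",
     "options": {"categories": ["SMOOTH", "SIMPLE"]}, "switch_states": ["SMOOTH", "SIMPLE"]},
    {"name": "fcCappingCampaign", "type": "numerical",
     "options": {"log_transform": False, "min_value": [1], "max_value": [100], "epsilon": None},
     "switch_states": [1, 5, 10, 20, 50]},
]


def switch_combinations(columns):
    """Yield one dict per combination of allowed switch states."""
    names = [column["name"] for column in columns]
    states = [column["switch_states"] for column in columns]
    for combination in product(*states):
        yield dict(zip(names, combination))


def predicted_cost(settings):
    # Hypothetical toy cost, standing in for the agent's predicted cost.
    technique_penalty = 0.0 if settings["deliveryTechnique"] == "SMOOTH" else 1.0
    return 0.1 * settings["fcCappingCampaign"] + technique_penalty + 0.5 * settings["hasAgofCampaign"]


combinations = list(switch_combinations(switch_columns))
best = min(combinations, key=predicted_cost)

print(len(combinations))  # 20, i.e. 2 * 2 * 5
print(best)
```

The number of combinations grows multiplicatively with each switch column, so short `switch_states` lists keep an exhaustive search tractable.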

Example Column

{
  "name": "size", 
  "type": "categorial", 
  "options": {
    "categories": ['90x728', '600x160', '250x300', '250x970', '50x300', '75x300','50x20']
    }
}
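
For readers who want to handle columns in code, the properties above map naturally onto a small data class. This is an illustrative sketch only; the class layout, the `from_dict` constructor and the `is_switch` property are assumptions, not the library's actual column class.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Column:
    name: str
    type: str
    options: dict = field(default_factory=dict)
    switch_states: Optional[list] = None  # optional property, absent for non-switch columns

    @classmethod
    def from_dict(cls, data):
        """Build a Column from a dict shaped like the example column."""
        return cls(
            name=data["name"],
            type=data["type"],
            options=data.get("options", {}),
            switch_states=data.get("switch_states"),
        )

    @property
    def is_switch(self):
        # Assumption: a column acts as a switch exactly when it lists switch states.
        return self.switch_states is not None


size = Column.from_dict({
    "name": "size",
    "type": "categorial",
    "options": {"categories": ["90x728", "600x160", "250x300", "250x970", "50x300", "75x300", "50x20"]},
})

print(size.name, size.is_switch)  # size False
```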