Output Destinations

You can create a new output destination or select an existing one. This is where TAZI Hunt saves the outputs it produces when it runs the model with the configuration you set up in the Configure Business Model step.

You can write model outputs to a file, a database, Kafka, GCS, S3, HDFS, or HTTP, using a destination that you configure.

If File is chosen as the Output Type, you will be directed to the Output Options step.

There are two extension options here: json and parquet. If you choose json, TAZI writes the outputs to a single json file. If you choose parquet, you will see a screen like the one below.

Here, you need to specify the path format and the send count. You can choose a path format from the available templates or write your own. The send count is the number of rows written to each output file. For example, with a send count of 1000, TAZI creates a new output file for every 1000 rows.
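To make the send count concrete, here is a minimal Python sketch of how rows could be grouped into output files. It is an illustration only, not TAZI's implementation: the function name split_by_send_count is hypothetical, and writing leftover rows to a final, smaller file is an assumption.

```python
from typing import Iterable, Iterator, List


def split_by_send_count(rows: Iterable[dict], send_count: int) -> Iterator[List[dict]]:
    """Yield consecutive batches of at most send_count rows (hypothetical helper)."""
    if send_count <= 0:
        raise ValueError("send_count must be positive")
    batch: List[dict] = []
    for row in rows:
        batch.append(row)
        if len(batch) == send_count:
            yield batch
            batch = []
    if batch:
        # Assumption: rows left over at the end go to one last, smaller file.
        yield batch


rows = [{"id": i} for i in range(2500)]
batch_sizes = [len(batch) for batch in split_by_send_count(rows, 1000)]
print(batch_sizes)  # [1000, 1000, 500]
```

Under these assumptions, 2500 rows with a send count of 1000 would produce three output files.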

Some keywords you can use:

{year} = Example output: 2021
{MM} = Example output: 10
{dd} = Example output: 05
{HH} = Example output: 13
{EPOCH} = Example output: 1640245620
{TIMESTAMP} = Example output: 2021-12-23-07-45-34-158
{RAND} = Universally unique identifier (UUID). Example output: 123e4567-e89b-12d3-a456-426614174000

You also need to specify the dir parameter. For example, to save your outputs to the /var/output directory, your path could be: /var/output/{EPOCH}.parquet

If you choose the format /var/output/mymodel/{yyyy}/{MM}/{dd}/{HH}/file.{EPOCH}.parquet, TAZI generates the file paths by replacing the variables in curly braces. For example, it could create a file named file.1640252780.parquet in the /var/output/mymodel/2021/12/23/12/ directory.
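The substitution can be sketched in Python as follows. This is only an illustration of the idea, not TAZI's own code: the function expand_path is hypothetical, {year} and {yyyy} are treated as equivalent because both spellings appear in this section, and the clock value (12:46:20 in a UTC+3 time zone) is an assumption chosen to reproduce the example path.

```python
import uuid
from datetime import datetime, timedelta, timezone


def expand_path(template: str, now: datetime) -> str:
    """Replace each {keyword} in template with a value derived from now (hypothetical helper)."""
    millis = now.microsecond // 1000
    values = {
        "{year}": f"{now.year:04d}",
        "{yyyy}": f"{now.year:04d}",  # assumed to be an alias of {year}
        "{MM}": f"{now.month:02d}",
        "{dd}": f"{now.day:02d}",
        "{HH}": f"{now.hour:02d}",
        "{EPOCH}": str(int(now.timestamp())),
        "{TIMESTAMP}": now.strftime("%Y-%m-%d-%H-%M-%S-") + f"{millis:03d}",
        "{RAND}": str(uuid.uuid4()),
    }
    for keyword, value in values.items():
        template = template.replace(keyword, value)
    return template


# Assumed clock: 23 December 2021, 12:46:20 in a UTC+3 time zone.
now = datetime(2021, 12, 23, 12, 46, 20, tzinfo=timezone(timedelta(hours=3)))
path = expand_path("/var/output/mymodel/{yyyy}/{MM}/{dd}/{HH}/file.{EPOCH}.parquet", now)
print(path)  # /var/output/mymodel/2021/12/23/12/file.1640252780.parquet
```

Plain string replacement is used instead of str.format so that any unrecognized text in curly braces is left untouched rather than raising an error.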

Note: Make sure TAZI is allowed to write to your destination folder. Otherwise you will get an error, because TAZI cannot create your output files. If the folder specified in the path does not exist, TAZI will create it for you.
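If you want to verify write access before submitting, a small check along these lines may help. It is a sketch under assumptions: check_output_dir is a hypothetical helper, it is not part of TAZI, and it only gives a meaningful answer when run as the same operating system user that TAZI runs as.

```python
import os
import tempfile


def check_output_dir(file_path: str) -> bool:
    """Create the parent directory of file_path if missing and report whether it is writable."""
    directory = os.path.dirname(file_path)
    os.makedirs(directory, exist_ok=True)
    # Creating files in a directory needs both write and execute permission on it.
    return os.access(directory, os.W_OK | os.X_OK)


# Demonstration against a temporary directory instead of a real output location.
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "mymodel", "2021", "12", "file.parquet")
    writable = check_output_dir(target)
    created = os.path.isdir(os.path.dirname(target))
    print(writable, created)
```

For a real destination, you would pass a path such as /var/output/mymodel/file.parquet instead of the temporary one.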

If DB is chosen as the Output Type, the Output Options step is enabled. In this step you set up a new database connection for your output or select an existing one:

If Kafka is chosen, you can create a new Kafka configuration or select one you created before:

You can also choose one of the other connection options: GCS, S3, or HDFS. Each offers the same two extension types, json and parquet, and you can follow the same steps as for file outputs.

After selecting the output destination for your model outputs, you can proceed to the overview of your model:

Here you can add a description to your Business Model, edit the default name, and assign a Group for authorization. Finally, hit the Submit button to start the training phase of your model with all of the configurations you set up previously.