AutoML Training Guide

Before you start

Note: Before using this script, you need to export your Google Cloud credentials.
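
A quick way to confirm the credentials are exported before running anything is sketched below. It assumes the credentials are exported through the standard GOOGLE_APPLICATION_CREDENTIALS environment variable pointing at a service-account key file; the original script may read them differently, and the helper name credentials_path is only illustrative.

```python
import os


def credentials_path(env=None):
    """Return the exported key-file path if it is set and exists, else None."""
    # Assumption: credentials are exported via GOOGLE_APPLICATION_CREDENTIALS.
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if path and os.path.isfile(path):
        return path
    return None


# Usage:
#   if credentials_path() is None:
#       print("Export GOOGLE_APPLICATION_CREDENTIALS before running the script")
```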

Note: The dataset files should be split in the following ratio:

Training Dataset (80%)

Test Dataset (10%)

Validation Dataset (10%)

For example, if you have 100 files, then 80 should go to the train dataset, 10 to the validation dataset and 10 to the test dataset.
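
The 80/10/10 split can be done by hand, but a small helper keeps it repeatable. The following is a minimal Python 3 sketch, not part of the original script: the function names, the fixed shuffle seed and the source_dir argument are assumptions, and it copies files into the dir1, dir2 and dir3 directories used in the steps that follow.

```python
import random
import shutil
from pathlib import Path


def split_files(files, train_pct=80, validation_pct=10, seed=0):
    """Shuffle files and split them into train, validation and test lists."""
    files = sorted(files)
    random.Random(seed).shuffle(files)  # fixed seed keeps the split repeatable
    n_train = len(files) * train_pct // 100
    n_val = len(files) * validation_pct // 100
    # Whatever is left after train and validation becomes the test set.
    return files[:n_train], files[n_train:n_train + n_val], files[n_train + n_val:]


def copy_into_dirs(source_dir, pattern="*.pdf"):
    """Copy the split files into dir1 (train), dir2 (validation), dir3 (test)."""
    files = list(Path(source_dir).glob(pattern))
    for name, group in zip(("dir1", "dir2", "dir3"), split_files(files)):
        Path(name).mkdir(exist_ok=True)
        for f in group:
            shutil.copy(f, Path(name) / f.name)


# Usage (hypothetical folder name):
#   copy_into_dirs("all_resumes")
```

With 100 files this yields 80, 10 and 10 files. For counts that do not divide evenly, the remainder lands in the test set.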

Manual Annotation Process

Step 1: Create a directory, say dir1, and add all the resumes that you want to use for the train dataset.

Step 2: Create dir2 and dir3 for the validation and test datasets, respectively.

Step 3: Install the Google Cloud SDK and authenticate with your email ID. Once the installation is complete, run the commands below one by one to upload the train, validation and test datasets to Google Cloud Storage.

Train:

python2 script.py -t gs://match_making/tenant1/hr/documents/train train,dir1/*.pdf

Validation:

python2 script.py -t gs://match_making/tenant1/hr/documents/validation validation,dir2/*.pdf

Test:

python2 script.py -t gs://match_making/tenant1/hr/documents/test test,dir3/*.pdf

Step 4: To verify that the upload completed successfully, navigate to the path that you gave in the script and check that your uploaded resumes/JDs are in Google Cloud Storage.
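
Besides browsing the bucket in the console, the check can be scripted. The following is a minimal Python 3 sketch that assumes the google-cloud-storage client library is installed and credentials are already exported; parse_gs_uri and list_uploaded are illustrative helper names, not part of the original script.

```python
def parse_gs_uri(uri):
    """Split 'gs://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    if not uri.startswith("gs://"):
        raise ValueError("expected a gs:// URI, got %r" % uri)
    bucket, _, prefix = uri[len("gs://"):].partition("/")
    return bucket, prefix


def list_uploaded(uri):
    """Return the names of the objects stored under the given gs:// path."""
    # Imported here so the URI helper works without the client library.
    from google.cloud import storage

    bucket, prefix = parse_gs_uri(uri)
    client = storage.Client()
    return [blob.name for blob in client.list_blobs(bucket, prefix=prefix)]


# Usage:
#   for name in list_uploaded("gs://match_making/tenant1/hr/documents/train"):
#       print(name)
```

Comparing the number of names returned with the number of files in the local directory is a quick sanity check.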

Step 5: After the data upload is done, import the files by creating a new dataset in AutoML, or use an existing dataset.

Step 6: Navigate to the dataset.csv file in Google Cloud Storage, select the file and click on import.

Step 7: Once you click on import, wait 5-10 minutes for all the resumes/JDs to be imported into the AutoML platform.

Step 8: Once the import is done, you will see all the PDF files and can start the annotation process.

Step 9: After completing the annotation, start the training, which usually takes about 3 hours.

Step 10: Once training is finished, you can test the model.

Auto Annotation Process

Step 1: To upload TXT files and use the auto-annotation feature, you first need to convert each PDF into a TXT file.
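
The guide does not say which tool to use for the conversion. One possible approach, sketched here in Python 3 under the assumption that the pypdf library is installed, extracts the text of each page and writes it next to a matching file name; the helper names and the output directory layout are illustrative.

```python
from pathlib import Path


def txt_path_for(pdf_path, out_dir):
    """Return the .txt path inside out_dir that corresponds to pdf_path."""
    return Path(out_dir) / (Path(pdf_path).stem + ".txt")


def pdf_to_txt(pdf_path, out_dir):
    """Extract the text of one PDF and write it to a .txt file in out_dir."""
    # Imported here so the path helper works without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    # extract_text() can return nothing for image-only pages, hence the "or".
    text = "\n".join(page.extract_text() or "" for page in reader.pages)
    target = txt_path_for(pdf_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


# Usage (hypothetical output folder):
#   for pdf in Path("pdfs").glob("*.pdf"):
#       pdf_to_txt(pdf, "dir1")
```

Scanned resumes that contain only images would come out empty with this approach and would need OCR instead.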

Step 2: Keep these files in separate directories, as you did for the PDF files, and use the commands below to upload them to Google Cloud Storage.

Note: The dict.csv file contains all the labels that we want to auto-annotate.

Train:

python2 script.py -d dict.csv -s train,dir1/*.txt gs://match_making/tenant1/hr/documents/train

python2 script.py -d dict_banking.csv -s train,dir1/*.txt gs://match_making/tenant1/banking/documents/train

Validation:

python2 script.py -d dict.csv -s validation,dir2/*.txt gs://match_making/tenant1/hr/documents/validate

python2 script.py -d dict_banking.csv -s validation,dir2/*.txt gs://match_making/tenant1/banking/documents/validate

Test:

python2 script.py -d dict.csv -s test,dir3/*.txt gs://match_making/tenant1/hr/documents/test

python2 script.py -d dict_banking.csv -s test,dir3/*.txt gs://match_making/tenant1/banking/documents/test
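
To give an idea of what dictionary-based auto-annotation involves, here is a minimal Python 3 sketch. It is not the logic of script.py, which is not shown in this guide. It assumes, purely for illustration, that each row of dict.csv has the form term,label, and it returns the character offsets of every whole-word, case-insensitive match.

```python
import csv
import io
import re


def load_dictionary(csv_text):
    """Parse 'term,label' rows (an assumed layout) into (term, label) pairs."""
    rows = csv.reader(io.StringIO(csv_text))
    return [(row[0].strip(), row[1].strip())
            for row in rows if len(row) >= 2 and row[0].strip()]


def auto_annotate(text, dictionary):
    """Return sorted (start, end, label) spans for each dictionary term in text."""
    spans = []
    for term, label in dictionary:
        # Match the term only when it is not glued to other word characters.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


# Usage (hypothetical dictionary contents):
#   terms = load_dictionary("Python,SKILL\nPune,LOCATION\n")
#   auto_annotate("Python developer based in Pune", terms)
```

This sketch does not resolve overlapping terms, so a dictionary containing both "Java" and "Java developer" would produce two spans over the same text.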

Step 3: To verify that the upload completed successfully, navigate to the path that you gave in the script and check that your uploaded resumes/JDs are in Google Cloud Storage.

Step 4: After the data upload is done, import the files by creating a new dataset in AutoML, or use an existing dataset.

Step 5: Navigate to the dataset.csv file in Google Cloud Storage, select the file and click on import.

Step 6: Once you click on import, wait 5-10 minutes for all the resumes/JDs to be imported into the AutoML platform.

Step 7: Once the import is done, you will see all the documents and can start the annotation process.

Step 8: After completing the annotation, start the training, which usually takes about 3 hours.

Step 9: Once training is finished, you can test the model.
