Question

Consider the text of the following documents:

Document 1: Sahil likes to play cricket
Document 2: Sajal likes cricket too
Document 3: Sajal also likes to play basketball

Apply all four steps of the Bag of Words model of NLP to the documents given above and generate the output.

Accepted Answer

## Model Answer

**Step 1: Collect Data (Documents)**

- Doc 1: Sahil likes to play cricket
- Doc 2: Sajal likes cricket too
- Doc 3: Sajal also likes to play basketball

**Step 2: Create a List of Unique Words (Vocabulary)**

{sahil, likes, to, play, cricket, sajal, too, also, basketball} *(9 unique words)*

**Step 3: Remove Stop Words (optional normalisation)**

Stop words like "to", "too", "also" may be removed → Vocabulary: {sahil, likes, play, cricket, sajal, basketball}

**Step 4: Create Document Vectors (Frequency Table)**

| Word       | Doc1 | Doc2 | Doc3 |
|------------|------|------|------|
| sahil      | 1    | 0    | 0    |
| likes      | 1    | 1    | 1    |
| play       | 1    | 0    | 1    |
| cricket    | 1    | 1    | 0    |
| sajal      | 0    | 1    | 1    |
| basketball | 0    | 0    | 1    |

Each document is now represented as a numerical vector based on word frequency.

*Source: Chapter 6, Bag of Words Model*

---

## Explanation

- Examiners expect **all four steps clearly labelled**: data collection → vocabulary creation → stop word removal → document vect…
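
The four steps of the model answer can also be checked with a short program. The following is a minimal Python sketch, not part of the prescribed answer. It assumes that text is lowercased and split on whitespace, and that the stop-word list is exactly the three words named in Step 3. The names `STOP_WORDS`, `tokenize` and `vectorize` are illustrative, not taken from the textbook.

```python
from collections import Counter

# Step 1: collect the documents
documents = [
    "Sahil likes to play cricket",
    "Sajal likes cricket too",
    "Sajal also likes to play basketball",
]

# Assumption: only the stop words named in the model answer are removed
STOP_WORDS = {"to", "too", "also"}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


# Step 2: build the vocabulary in order of first appearance
vocabulary = []
for doc in documents:
    for word in tokenize(doc):
        if word not in vocabulary:
            vocabulary.append(word)

# Step 3: remove stop words from the vocabulary
filtered_vocabulary = [w for w in vocabulary if w not in STOP_WORDS]


# Step 4: count how often each vocabulary word occurs in a document
def vectorize(text, vocab):
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocab]


vectors = [vectorize(doc, filtered_vocabulary) for doc in documents]

print(filtered_vocabulary)
for number, vector in enumerate(vectors, start=1):
    print(f"Doc{number}: {vector}")
```

Each printed vector corresponds to one column of the frequency table in Step 4, read from top to bottom. For example, Doc2 gives `[0, 1, 0, 1, 1, 0]` for the words sahil, likes, play, cricket, sajal, basketball.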

Artificial Intelligence — CBSE Class 10 board question