CBSE Class X
Artificial Intelligence
Question Paper
From previous CBSE Board Exam questions
Code: 8Z6EEI · Questions: 53 · Maximum Marks: 90 · Generated: 2026-06-21 02:53
Selections used
Source: Previous-year board
Subject: Artificial Intelligence
Lessons: Natural Language Processing
Questions selected: 53
If a question refers to an image, map, graph or diagram that is not shown here, open the Study Guide single page app, go to Library and find the actual CBSE question paper. The original papers are also available on the CBSE website: cbse.gov.in.
Q1. [4]
Document 1 : Data Science requires information. Document 2 : Information analysis requires data.
Implement all the four steps of Bag of Words (BoW) model to create a document vector table.
Previously asked in: 2026 104 Q21
Q2. [4]
Consider the following scenarios and identify which AI domain would be most appropriate for each, with justification:
  1. (A) An AI based education platform needs to translate to English language and analyze thousands of student essays to provide instant feedback on grammar, content quality and writing style. [2]
  2. (B) An AI based application installed on a busy crossing in a metropolitan city scans all vehicles driving through that crossing during peak traffic hours and categorizes them into four wheelers and two wheelers. [2]
Previously asked in: 2026 104 Q18
Q3. [2]
How is Stemming different from Lemmatization ? Explain how the word "Wolves" would be processed by stemming and lemmatization.
Previously asked in: 2026 104 Q16
Q4. [1]
A company wants to analyze customer reviews to understand satisfaction levels. Which NLP application would be most suitable ?
  1. (A) Text classification
  2. (B) Sentiment analysis
  3. (C) Keyword extraction
  4. (D) Language translation
Previously asked in: 2026 104 Q5 (v)
Q5. [1]
Assertion (A) : Converting text to lowercase is preferable in text preprocessing. Reason (R) : It ensures that "Hello" and "hello" are treated as the same word by the machine.
  1. (A) Both (A) and (R) are true and (R) is the correct explanation of (A).
  2. (B) Both (A) and (R) are true, but (R) is not the correct explanation of (A).
  3. (C) (A) is true, but (R) is false.
  4. (D) (A) is false, but (R) is true.
Previously asked in: 2026 104 Q5 (ii)
Q6. [1]
Which type of chat bot requires coding and works on bigger databases directly ?
  1. (A) Script bot
  2. (B) Smart bot
  3. (C) Traditional bot
  4. (D) Rule-based bot
Previously asked in: 2026 104 Q4 (vi)
Q7. [1]
Which NLP application helps in converting natural speech into text in real time ?
  1. (A) Keyword Extraction tool
  2. (B) Translation of books from English to Hindi language
  3. (C) Auto generated captions on YouTube
  4. (D) Classifying raw text into pre-defined groups
Previously asked in: 2026 104 Q3 (v)
Q8. [1]
Consider the following sentence: On seeing her son's result, Pooja's face turned red with anger. The word "red" demonstrates which characteristic of natural language ?
  1. (A) Redundancy
  2. (B) Context-dependent meaning
  3. (C) Grammatical structure
  4. (D) Temporal change
Previously asked in: 2026 104 Q2 (ii)
Q9. [4]
Categorise the following examples under the given three AI domains — Data Science, NLP and Computer Vision with justification :
  1. (a) Recommendation Websites
  2. (b) Voice-based Virtual Assistants
  3. (c) Spam Filters
  4. (d) Airline Route Planning
Previously asked in: 2025 104/S Q20
Q10. [4]
You have two documents : Document 1 : CV is an upcoming field. Document 2 : Image Feature is an important part of CV. Implement all four steps of the Bag of Words (BoW) model to create a document vector table. Depict the outcome of each step.
Previously asked in: 2025 104/S Q19
Q11. [1]
Sentiment analysis of customer reviews on various online stores is an example of ____________.
  1. (A) Machine Learning
  2. (B) Computer Vision
  3. (C) Natural Language Processing (NLP)
  4. (D) Speech Recognition
Previously asked in: 2025 104/S Q5 (i)
Q12. [1]
The first step of Bag of Words algorithm is Text Normalisation. Which of the following task is done in this step?
  1. (A) Creating document vectors
  2. (B) Collecting and pre-processing data
  3. (C) Adding the words to a dictionary
  4. (D) Creating a vector of words
Previously asked in: 2025 104/S Q4 (iv)
Q13. [1]
In Natural Language Processing (NLP), ___________ occur/s very frequently in the corpus but do/does not add any value to it.
  1. (A) Text Normalisation
  2. (B) Stop words
  3. (C) Start words
  4. (D) Tokenisation
Previously asked in: 2025 104/S Q4 (i)
Q14. [1]
In the sentence 'She reads the book', which of the following is a stop-word that should be removed during text preprocessing?
  1. (A) She
  2. (B) reads
  3. (C) the
  4. (D) book
Previously asked in: 2025 104/S Q3 (vi)
Q15. [1]
Which of the following words represents an example of stemming for the word 'Sharing'?
  1. (A) Share
  2. (B) Shared
  3. (C) Shares
  4. (D) Shar
Previously asked in: 2025 104/S Q2 (iv)
Q16. [4]
Consider the following documents : Document 1 : NLP is a domain of AI. Document 2 : NLP stands for Natural Language Processing. Implement all the four steps of Bag of Words (BoW) model to create a document vector table.
Previously asked in: 2024 104 Q20
Q17. [2]
What is the primary difference between Human Language and Computer Language ?
Previously asked in: 2024 104 Q15
Q18. [2]
Differentiate between Computer Vision (CV) and Natural Language Processing (NLP).
Previously asked in: 2024 104 Q11
Q19. [1]
Which type of chat-bot has a wide functionality, is flexible and powerful, and works on bigger databases directly ?
Previously asked in: 2024 104 Q5 (vi)
Q20. [1]
In the context of NLP, which of the following words represents a stem resulting from stemming for "Studies" ?
  1. (A) Study
  2. (B) Stud
  3. (C) Studi
  4. (D) Studied
Previously asked in: 2024 104 Q5 (iv)
Q21. [1]
Which of the following applications of NLP (Natural Language Processing) is associated with spam filtering in e-mails ?
  1. (A) Virtual Assistants
  2. (B) Sentiment Analysis
  3. (C) Text Classification
  4. (D) Automatic Summarization
Previously asked in: 2024 104 Q4 (v)
Q22. [1]
Which application of NLP helps to provide an overview of a news item or blog post ? It also avoids redundancy from multiple sources and maximises the diversity of content obtained.
  1. (A) Virtual Assistants
  2. (B) Sentiment Analysis
  3. (C) Text Classification
  4. (D) Automatic Summarization
Previously asked in: 2024 104 Q3 (v)
Q23. [1]
It is a domain-specific language that is designed for managing data held in different kinds of DBMS (Database Management System). It is particularly useful in handling structured data. Which computer language is this ?
  1. (A) SQL
  2. (B) CSV
  3. (C) Spreadsheet
  4. (D) TXT
Previously asked in: 2024 104 Q3 (iv)
Q24. [1]
A corpus contains 4 documents in which the words such as 'an, is, the' were appearing frequently. Identify the term that is used for such words.
  1. (A) Stop word
  2. (B) Rare word
  3. (C) Missing word
  4. (D) Removable word
Previously asked in: 2024 104 Q2 (vi)
Q25. [1]
Spam refers to
  1. (A) Unnecessary images
  2. (B) Temporary files
  3. (C) Junk mails
  4. (D) Music files
Previously asked in: 2024 104 Q1 (ii)
Q26. [4]
Consider the following two documents : Document 1 : ML and DL are part of AI. Document 2 : DL is a subset of ML. Implement all four steps of the Bag of Words (BoW) model to create a document vector table. Depict the outcome of each step.
Previously asked in: 2024 104 Q19
Q27. [2]
What are the primary differences between Script-bots and Smart-bots ?
Previously asked in: 2024 104 Q12
Q28. [1]
Which domain of AI is used for interacting with virtual assistants such as Siri and Alexa ?
  1. (a) Machine Learning (ML)
  2. (b) Computer Vision (CV)
  3. (c) Natural Language Processing (NLP)
  4. (d) Technical Vision (TV)
Previously asked in: 2024 104 Q5 (vi)
Q29. [1]
Bag of Words is a ________ model which helps in extracting features out of the text which can be helpful in machine learning algorithms.
  1. (a) Data Science (DS)
  2. (b) Virtual Reality (VR)
  3. (c) Natural Language Processing (NLP)
  4. (d) Computer Vision (CV)
Previously asked in: 2024 104 Q4 (iii)
Q30. [1]
Which of the following applications is not associated with Natural Language Processing (NLP) ?
  1. (a) Sentiment Analysis
  2. (b) Speech Recognition
  3. (c) Spam Filtering in emails
  4. (d) Stock Market Analysis
Previously asked in: 2024 104 Q3 (vi)
Q31. [1]
Which of the following words represent an example of a lemma resulting from lemmatisation for "caring" in context to Natural Language Processing (NLP) ?
  1. (a) Care
  2. (b) Cared
  3. (c) Cares
  4. (d) Car
Previously asked in: 2024 104 Q2 (iv)
Q32. [1]
This real life application of NLP is used to provide an overview of a news item or blog post, while avoiding redundancy from multiple sources and maximising the diversity of content obtained. Which is this application ?
  1. (a) Chatbot
  2. (b) Virtual Assistant
  3. (c) Sentiment Analysis
  4. (d) Automatic Summarisation
Previously asked in: 2024 104 Q2 (ii)
Q33. [4]
Create a document vector table from the following documents by implementing all the four steps of Bag of words model. Also depict the outcome of each step. Document 1: Sameera and Sanya are classmates. Document 2: Sameera likes dancing but Sanya loves to study mathematics.
Previously asked in: 2023 104 Q19
Q34. [2]
With reference to data processing, expand the term TFIDF. Also give any two applications of TFIDF.
Previously asked in: 2023 104 Q16
Q35. [2]
Define Chatbot. What are its types?
Previously asked in: 2023 104 Q13
Q36. [1]
Smart Assistants such as Alexa, Siri are the examples of:
  1. (a) Natural Language Processing
  2. (b) Data Science
  3. (c) Machine Learning
  4. (d) Computer Vision
Previously asked in: 2023 104 Q5 (iii)
Q37. [1]
______ is a term used for any word or number or special character occurring in a sentence. (Token / Punctuator)
Previously asked in: 2023 104 Q5 (i)
Q38. [1]
Which of the following is a feature of document classification?
  1. (a) Helps in classifying the type and genre of a document.
  2. (b) Helps in creating a document.
  3. (c) Helps to display important information of a corpus.
  4. (d) Helps in including the necessary words in the text body.
Previously asked in: 2023 104 Q4 (i)
Q39. [1]
With reference to NLP, consider the following plot of occurrence of words versus their value: In the given graph, X represents:
  1. (a) Rare / valuable words
  2. (b) Punctuation words
  3. (c) Popular words
  4. (d) Pronoun
Previously asked in: 2023 104 Q3 (vi)
Q40. [1]
For ______ the whole corpus is divided into sentences. Each sentence is taken as a different data so now the whole corpus gets reduced to sentences.
  1. (a) Text Regulation
  2. (b) Sentence Segmentation
  3. (c) Tokenisation
  4. (d) Stemming
Previously asked in: 2023 104 Q3 (iii)
Q41. [1]
Select the correct features of Smart Bot:
  1. (a) Smart-bots are flexible and powerful
  2. (b) Coding is required to take this up on board
  3. (c) Smart bots work on bigger databases and other resources directly
  4. (d) All of the above
Previously asked in: 2023 104 Q3 (ii)
Q42. [1]
Email filters, spam filters, smart assistants are the examples of:
  1. (a) Pocket Assistants
  2. (b) CV
  3. (c) NLP
  4. (d) Evaluation
Previously asked in: 2023 104 Q3 (i)
Q43. [1]
Two popular examples of pocket assistants are _____ and _____.
Previously asked in: 2023 104 Q2 (i)
Q44. [4]
With reference to NLP, explain the following terms in detail with the help of suitable example: • Term frequency • Inverse Document Frequency
Previously asked in: 2022 104 Q20
Q45. [4]
Consider the text of following documents: Document 1: Sahil likes to play cricket Document 2: Sajal likes cricket too Document 3: Sajal also likes to play basketball Apply all the four steps of Bag of words model of NLP on the above given documents and generate the output.
Previously asked in: 2022 104 Q19
Q46. [2]
Explain the following picture which depicts one of the processes on NLP. Also mention the purpose which will be achieved by this process.
Previously asked in: 2022 104 Q18
Q47. [2]
Kaira, a beginner in the field of NLP is trying to understand the process of Stemming. Help her in filling up the following table by suggesting appropriate affixes and stem of the words mentioned there:
Previously asked in: 2022 104 Q16
Q48. [2]
What is Tokenization? Count how many tokens are present in the following statement: I find that the harder I work, the more luck I seem to have.
Previously asked in: 2022 104 Q15
Q49. [2]
Differentiate between Script-bot and Smart-bot.
Previously asked in: 2025 104/S Q12; 2022 104 Q13
Q50. [1]
Name the process of dividing whole corpus into sentences.
Previously asked in: 2022 104 Q10
Q51. [1]
Name any two currently popular virtual assistants.
Previously asked in: 2022 104 Q9
Q52. [1]
Mention any two commonly used applications of NLP.
Previously asked in: 2022 104 Q8
Q53. [1]
What is NLP?
Previously asked in: 2022 104 Q7
Answer Key
Q1. [4]
Document 1 : Data Science requires information. Document 2 : Information analysis requires data.
Implement all the four steps of Bag of Words (BoW) model to create a document vector table.
Previously asked in: 2026 104 Q21
Generated by claude-sonnet-4-6 · 2026-06-21 03:20 · grounding stimulus
Model Answer

Step 1 – Collect Documents:
Document 1: Data Science requires information.
Document 2: Information analysis requires data.

Step 2 – Create Vocabulary (unique words, lowercase):
{data, science, requires, information, analysis}

Step 3 – Create Word Frequency Vectors:

| | data | science | requires | information | analysis |
|---|---|---|---|---|---|
| Doc 1 | 1 | 1 | 1 | 1 | 0 |
| Doc 2 | 1 | 0 | 1 | 1 | 1 |

Step 4 – Document Vector Table ready.
Each document is represented as a numeric vector based on word frequency.
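
The four steps above can be sketched in Python as a rough illustration. This is a minimal sketch, not the textbook's own code: the helper names `normalise`, `build_vocabulary` and `to_vector` are hypothetical, and the choice to lowercase the text and strip punctuation is an assumption made so that the output matches the vocabulary shown above.

```python
import string

documents = [
    "Data Science requires information.",
    "Information analysis requires data.",
]

def normalise(text):
    """Collect and pre-process: lowercase, strip punctuation, split into tokens."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()

def build_vocabulary(token_lists):
    """Create the dictionary: unique words in order of first appearance."""
    vocabulary = []
    for tokens in token_lists:
        for token in tokens:
            if token not in vocabulary:
                vocabulary.append(token)
    return vocabulary

def to_vector(tokens, vocabulary):
    """Create one document vector: count each vocabulary word in the document."""
    return [tokens.count(word) for word in vocabulary]

token_lists = [normalise(doc) for doc in documents]
vocabulary = build_vocabulary(token_lists)
# Repeat the vector step for every document to get the document vector table.
vectors = [to_vector(tokens, vocabulary) for tokens in token_lists]

print(vocabulary)
for number, vector in enumerate(vectors, start=1):
    print(f"Doc {number}: {vector}")
```

Each printed row corresponds to one row of the document vector table.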

Source: Bag of Words Model, Document Representation

---

Explanation
Q2. [4]
Consider the following scenarios and identify which AI domain would be most appropriate for each, with justification:
  1. (A) An AI based education platform needs to translate to English language and analyze thousands of student essays to provide instant feedback on grammar, content quality and writing style. [2]
  2. (B) An AI based application installed on a busy crossing in a metropolitan city scans all vehicles driving through that crossing during peak traffic hours and categorizes them into four wheelers and two wheelers. [2]
Previously asked in: 2026 104 Q18
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(A) Natural Language Processing (NLP)

The most appropriate AI domain is NLP. The platform needs to translate essays into English and then analyze grammar, content quality, and writing style — all of which involve understanding and processing human natural language. NLP algorithms extract meaning from written text, making it ideal for grammar checking, feedback generation, and language translation tasks.

(B) Computer Vision (CV)

The most appropriate AI domain is Computer Vision. The application scans vehicles (visual data from a busy crossing) and categorizes them into four-wheelers and two-wheelers. CV enables machines to acquire, analyse, and interpret visual information from images or video feeds, making it perfect for vehicle detection and classification tasks.

Source: Chapter 1, Section 1.2 — Introduction to AI Domains

---

Explanation
Q3. [2]
How is Stemming different from Lemmatization ? Explain how the word "Wolves" would be processed by stemming and lemmatization.
Previously asked in: 2026 104 Q16
Generated by claude-sonnet-4-6 · 2026-06-21 03:20 · grounding rag
Model Answer

Stemming removes affixes from words to get the root/base form, but the result may not be a meaningful word. Lemmatization also removes affixes, but always produces a meaningful word (called a lemma).

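Processing "Wolves":
Stemming: Wolves → Wolv (the affix "es" is removed; the result is not a meaningful word)
Lemmatization: Wolves → Wolf (a meaningful word, the lemma)
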
Source: Chapter 6, Section 6.5 – Text Processing (Stemming & Lemmatization)
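
The contrast between the two processes can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `SUFFIX_RULES` is a hypothetical four-rule stemmer and `LEMMA_LOOKUP` is a hypothetical three-word dictionary. Real stemmers and lemmatizers use far larger rule sets and vocabularies.

```python
# Hypothetical suffix rules, tried in order; real stemmers use larger rule sets.
SUFFIX_RULES = [("ies", "i"), ("ing", ""), ("es", ""), ("s", "")]

# Hypothetical mini-dictionary standing in for a lemmatizer's vocabulary lookup.
LEMMA_LOOKUP = {"wolves": "wolf", "studies": "study", "sharing": "share"}

def toy_stem(word):
    """Strip the first matching suffix; the result may not be a real word."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            return word[: -len(suffix)] + replacement
    return word

def toy_lemmatize(word):
    """Return the dictionary form if it is known, else the word unchanged."""
    word = word.lower()
    return LEMMA_LOOKUP.get(word, word)

for word in ["Wolves", "Studies", "Sharing"]:
    print(word, "->", toy_stem(word), "(stem),", toy_lemmatize(word), "(lemma)")
```

The stemmer only chops characters, so it can return non-words such as "wolv". The lemmatizer looks the word up, so it returns a meaningful base form.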

---

Explanation
Q4. [1]
A company wants to analyze customer reviews to understand satisfaction levels. Which NLP application would be most suitable ?
  1. (A) Text classification
  2. (B) Sentiment analysis
  3. (C) Keyword extraction
  4. (D) Language translation
Previously asked in: 2026 104 Q5 (v)
Generated by claude-sonnet-4-6 · 2026-06-21 03:20 · grounding rag
Model Answer

(B) Sentiment Analysis — It analyzes customer reviews to determine whether opinions are positive, negative, or neutral, making it the most suitable for understanding customer satisfaction levels.

Source: Applications of Natural Language Processing, Chapter 6

Explanation

Examiners expect you to directly choose (B) Sentiment Analysis and justify it briefly in one line. Remember: Sentiment Analysis specifically detects emotions/opinions in text (positive/negative/neutral), which directly maps to "satisfaction levels." Text classification categorizes documents, keyword extraction finds key terms, and language translation converts languages — none specifically measure satisfaction.
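
As a rough illustration of how reviews can be sorted into positive, negative or neutral, here is a minimal word-counting sketch. The word lists and the function `toy_sentiment` are hypothetical and chosen only for this example. Practical sentiment analysis usually relies on models trained on labelled reviews.

```python
# Hypothetical word lists; a real system would learn these from labelled data.
POSITIVE_WORDS = {"good", "great", "excellent", "happy", "love"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "slow", "hate"}

def toy_sentiment(review):
    """Label a review by comparing counts of positive and negative words."""
    tokens = [word.strip(".,!?") for word in review.lower().split()]
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"

reviews = [
    "Great phone, I love the camera!",
    "Terrible battery and slow delivery.",
    "The parcel arrived on Monday.",
]
for review in reviews:
    print(toy_sentiment(review), "-", review)
```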

Q5. [1]
Assertion (A) : Converting text to lowercase is preferable in text preprocessing. Reason (R) : It ensures that "Hello" and "hello" are treated as the same word by the machine.
  1. (A) Both (A) and (R) are true and (R) is the correct explanation of (A).
  2. (B) Both (A) and (R) are true, but (R) is not the correct explanation of (A).
  3. (C) (A) is true, but (R) is false.
  4. (D) (A) is false, but (R) is true.
Previously asked in: 2026 104 Q5 (ii)
Generated by claude-sonnet-4-6 · 2026-06-21 03:20 · grounding rag
Model Answer

(A) Both (A) and (R) are true and (R) is the correct explanation of (A).

Converting text to lowercase is a key preprocessing step, and the reason correctly explains that it prevents the machine from treating "Hello" and "hello" as different words.

Source: Chapter 6, Section 6.5 – Text Processing (Converting Text to a Common Case)

---

Explanation

The textbook explicitly states: "we convert the whole text into a similar case, preferably lowercase. This ensures that the case sensitivity of the machine does not consider the same words as different just because of different cases." The Reason directly and correctly explains the Assertion, so option (A) is the right choice. In Assertion-Reason questions, always check if the Reason is a direct cause of the Assertion — here it is.
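
The effect of case conversion can be seen in a short sketch. The sample sentence is made up for illustration, and the counting uses Python's standard `collections.Counter`.

```python
from collections import Counter

text = "Hello world hello"

# Without case conversion, "Hello" and "hello" are counted as different words.
raw_counts = Counter(text.split())

# After lowercasing, both forms collapse into the single token "hello".
lower_counts = Counter(text.lower().split())

print(raw_counts)
print(lower_counts)
```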

Q6. [1]
Which type of chat bot requires coding and works on bigger databases directly ?
  1. (A) Script bot
  2. (B) Smart bot
  3. (C) Traditional bot
  4. (D) Rule-based bot
Previously asked in: 2026 104 Q4 (vi)
Generated by claude-sonnet-4-6 · 2026-06-21 03:20 · grounding rag
Model Answer

(B) Smart bot

A Smart bot requires coding and works on bigger databases directly.

Explanation

The source passage distinguishes two types of chatbots: Script-bot (scripted/traditional, rule-based, simpler) and Smart-bot (AI-powered, works with larger databases, requires coding). The question describes characteristics of a Smart bot. Options C and D are not the two types named in the passage; option A (Script bot) is the simpler, non-coding type.

Q7. [1]
Which NLP application helps in converting natural speech into text in real time ?
  1. (A) Keyword Extraction tool
  2. (B) Translation of books from English to Hindi language
  3. (C) Auto generated captions on YouTube
  4. (D) Classifying raw text into pre-defined groups
Previously asked in: 2026 104 Q3 (v)
Generated by claude-sonnet-4-6 · 2026-06-21 03:20 · grounding rag
Model Answer

(C) Auto generated captions on YouTube

Auto-generated captions convert natural speech into text in real time, making video content more accessible.

Explanation

The passage explicitly states: "Captions are generated by turning natural speech into text in real-time" and gives YouTube as an example. Other options describe different NLP applications — keyword extraction, language translation, and text classification respectively.

Q8. [1]
Consider the following sentence: On seeing her son's result, Pooja's face turned red with anger. The word "red" demonstrates which characteristic of natural language ?
  1. (A) Redundancy
  2. (B) Context-dependent meaning
  3. (C) Grammatical structure
  4. (D) Temporal change
Previously asked in: 2026 104 Q2 (ii)
Generated by claude-sonnet-4-6 · 2026-06-21 03:20 · grounding rag
Model Answer

(B) Context-dependent meaning

The word "red" here means anger, not a colour. Its meaning changes based on context, which is a key characteristic of natural language.

Explanation

The source passage directly uses a similar example — "His face turned red after he found out..." — to illustrate that the word "red" can mean shame, anger, or a physical reaction depending on context. Examiners expect students to identify this as context-dependent meaning. The other options (redundancy, grammatical structure, temporal change) are also features of natural language but do not apply here.

Q9. [4]
Categorise the following examples under the given three AI domains — Data Science, NLP and Computer Vision with justification :
  1. (a) Recommendation Websites
  2. (b) Voice-based Virtual Assistants
  3. (c) Spam Filters
  4. (d) Airline Route Planning
Previously asked in: 2025 104/S Q20
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

| Example | Domain | Justification |
|---|---|---|
| (a) Recommendation Websites | Data Science (Statistical Data) | They collect and analyse large amounts of user data to suggest products/content, deriving meaningful insights from datasets. |
| (b) Voice-based Virtual Assistants | NLP | They interact with humans using natural spoken language; NLP algorithms decode and respond to human speech. |
| (c) Spam Filters | NLP | They detect certain words/phrases in emails to identify spam — one of the earliest NLP applications. |
| (d) Airline Route Planning | Data Science (Statistical Data) | They analyse large statistical datasets (weather, fuel, traffic) to extract insights and optimise flight routes. |

Source: Chapter 1, Section 1.2 — Introduction to AI Domains

---

Explanation
Q10. [4]
You have two documents : Document 1 : CV is an upcoming field. Document 2 : Image Feature is an important part of CV. Implement all four steps of the Bag of Words (BoW) model to create a document vector table. Depict the outcome of each step.
Previously asked in: 2025 104/S Q19
Generated by claude-sonnet-4-6 · 2026-06-21 03:20 · grounding stimulus
Model Answer

Step 1 – Collect Training Data (Documents):
Document 1: CV is an upcoming field.
Document 2: Image Feature is an important part of CV.

Step 2 – Design the Vocabulary (unique words):
{ CV, is, an, upcoming, field, Image, Feature, important, part, of }
(Stop words retained; total = 10 unique words)

Step 3 – Create Document Vectors (word frequency count):

| Word | CV | is | an | upcoming | field | Image | Feature | important | part | of |
|---|---|---|---|---|---|---|---|---|---|---|
| Doc 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| Doc 2 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 |

Step 4 – Use Vectors for ML Model:
Doc 1 → [1,1,1,1,1,0,0,0,0,0]
Doc 2 → [1,1,1,0,0,1,1,1,1,1]

These numerical vectors are fed into a Machine Learning model for text classification or analysis.

---

Explanation
Q11. [1]
Sentiment analysis of customer reviews on various online stores is an example of ____________.
  1. (A) Machine Learning
  2. (B) Computer Vision
  3. (C) Natural Language Processing (NLP)
  4. (D) Speech Recognition
Previously asked in: 2025 104/S Q5 (i)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(C) Natural Language Processing (NLP)

Sentiment analysis of customer reviews involves analysing textual data to detect positive, negative, or neutral opinions — which is an application of NLP.

Source: Chapter 6, Section 6.2 Applications of Natural Language Processing

---

Explanation
Q12. [1]
The first step of Bag of Words algorithm is Text Normalisation. Which of the following task is done in this step?
  1. (A) Creating document vectors
  2. (B) Collecting and pre-processing data
  3. (C) Adding the words to a dictionary
  4. (D) Creating a vector of words
Previously asked in: 2025 104/S Q4 (iv)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(B) Collecting and pre-processing data

In the Bag of Words algorithm, the first step is Text Normalisation, which involves collecting data and pre-processing it.

Explanation

The source passage explicitly lists the steps of Bag of Words: Step 1 is "Collecting data and pre-processing it," Step 2 is creating a dictionary, and Steps 3 and 4 involve creating document vectors. Examiners expect you to directly identify option (B) as the answer. Text Normalisation is the name the question gives to this first step, so the task performed in it is collecting and pre-processing the data; options (A), (C) and (D) describe the later steps.

Q13. [1]
In Natural Language Processing (NLP), ___________ occur/s very frequently in the corpus but do/does not add any value to it.
  1. (A) Text Normalisation
  2. (B) Stop words
  3. (C) Start words
  4. (D) Tokenisation
Previously asked in: 2025 104/S Q4 (i)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(B) Stop words

Stop words occur very frequently in the corpus (e.g., "is", "the", "and") but do not add any meaningful value to it.

Explanation

The source passage (Test Yourself, Q.7) defines stop words as "words with negligible value that are often removed during preprocessing." They appear frequently but carry no significant meaning, making (B) the correct answer. Text Normalisation and Tokenisation are processes, not word types; "Start words" is not an NLP term.

Q14. [1]
In the sentence 'She reads the book', which of the following is a stop-word that should be removed during text preprocessing?
  1. (A) She
  2. (B) reads
  3. (C) the
  4. (D) book
Previously asked in: 2025 104/S Q3 (vi)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(C) the

"The" is a stop word — a common word with negligible meaning that is removed during text preprocessing, as it adds no significant value to text analysis.

Explanation

Stop words are high-frequency words (like the, is, a, an, of) that carry little meaningful information and are removed during text normalisation. The source (Q.7, Test Yourself) defines stop words as "words with negligible value that are often removed during preprocessing." In the sentence, 'reads' and 'book' carry the main meaning, and 'the' is the typical stop word expected as the answer. Some stop-word lists also include pronouns such as 'she', but among the given options 'the' is the clearest choice.
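
Stop-word removal on this sentence can be sketched as follows. The set `STOP_WORDS` is a hypothetical list limited to the examples named above; lists used in practice are much longer and differ between libraries.

```python
# Hypothetical stop-word list, limited to the examples named in the explanation.
STOP_WORDS = {"the", "is", "a", "an", "of"}

def remove_stop_words(sentence):
    """Lowercase, split on whitespace, and drop tokens found in STOP_WORDS."""
    return [token for token in sentence.lower().split() if token not in STOP_WORDS]

print(remove_stop_words("She reads the book"))
```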

Q15. [1]
Which of the following words represents an example of stemming for the word 'Sharing'?
  1. (A) Share
  2. (B) Shared
  3. (C) Shares
  4. (D) Shar
Previously asked in: 2025 104/S Q2 (iv)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(D) Shar

Stemming cuts off affixes to get a base/root form, which may not be a real word. For 'Sharing', stemming removes '-ing' to give Shar, not a valid word.

Explanation

Stemming is a crude, rule-based process that simply chops off suffixes. It does not guarantee a meaningful word (unlike lemmatization, which gives 'Share'). 'Shar' is the classic stemmed output of 'Sharing'. Examiners expect you to distinguish stemming (may produce non-words) from lemmatization (always produces valid base words).

Q16. [4]
Document 1 : NLP is a domain of AI. Document 2 : NLP stands for Natural Language Processing.
Consider the following documents : Implement all the four steps of Bag of Words (BoW) model to create a document vector table.
Previously asked in: 2024 104 Q20
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding stimulus
Model Answer

Step 1 – Collect Data (Documents):
Document 1: NLP is a domain of AI.
Document 2: NLP stands for Natural Language Processing.

Step 2 – Tokenisation (create unique word list/vocabulary):
NLP, is, a, domain, of, AI, stands, for, Natural, Language, Processing
(Total 11 unique words)

Step 3 – Create Document Vectors (word frequency count):

| Word | Doc 1 | Doc 2 |
|------|-------|-------|
| NLP | 1 | 1 |
| is | 1 | 0 |
| a | 1 | 0 |
| domain | 1 | 0 |
| of | 1 | 0 |
| AI | 1 | 0 |
| stands | 0 | 1 |
| for | 0 | 1 |
| Natural | 0 | 1 |
| Language | 0 | 1 |
| Processing | 0 | 1 |

Step 4 – Apply BoW Model:
Each document is represented as a vector of word frequencies:
Doc 1 → [1,1,1,1,1,1,0,0,0,0,0]
Doc 2 → [1,0,0,0,0,0,1,1,1,1,1]

Source: AI Chapter, Natural Language Processing – Bag of Words

---

Explanation

Examiners award 1 mark per step. Ensure all four steps are clearly labelled. The vocabulary must list unique words only. The document vector table (frequency count) is the core deliverable — write it neatly. Final vectors in bracket notation reinforce Step 4 and show you understand the BoW output format.

Q17. [2]
What is the primary difference between Human Language and Computer Language ?
Previously asked in: 2024 104 Q15
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

Human Language vs Computer Language:

Human languages (natural languages) are complex, redundant, context-dependent, and change over time. They use syntax, semantics, and lexicon to convey meaning. Computer languages (e.g., Python, C++, Java) are numerical/binary in nature — structured, precise, and follow strict instructions. Computers cannot directly understand natural language; NLP is needed to bridge this gap.

Source: Chapter 6, Section 6.1 – Introduction

---

Explanation
Q18. [2]
Differentiate between Computer Vision (CV) and Natural Language Processing (NLP).
Previously asked in: 2024 104 Q11
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

| Basis | Computer Vision (CV) | Natural Language Processing (NLP) |
|-------|---------------------|----------------------------------|
| Definition | Enables machines to interpret and understand visual information (images/videos). | Enables machines to understand and process human language (text/speech). |
| Input | Images, videos | Text, audio/speech |
| Example | Facial recognition, object detection | Chatbots, language translation |

Source: Unit 5: Computer Vision, Chapter 5

Explanation
Q19. [1]
Which type of chat-bot has a wide functionality, is flexible and powerful, and works on bigger databases directly ?
Previously asked in: 2024 104 Q5 (vi)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

Smart-bot has wide functionality, is flexible and powerful, and works on bigger databases directly.

Explanation

The question tests knowledge of the two types of chatbots: Script-bot (rule-based, limited) and Smart-bot (AI-powered, flexible, works on large databases). Examiners expect the specific term Smart-bot as the answer.

Q20. [1]
In the context of NLP, which of the following words represents a stem resulting from stemming for "Studies" ?
  1. (A) Study
  2. (B) Stud
  3. (C) Studi
  4. (D) Studied
Previously asked in: 2024 104 Q5 (iv)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(C) Studi

Stemming chops off suffixes mechanically without checking meaning. For "Studies," the stemmer removes the suffix, producing "Studi" — not a real word, unlike lemmatization which gives "Study."

Explanation

Stemming is a crude, rule-based process that strips affixes regardless of whether the result is a valid word. "Studi" is the classic example of stemming output for "Studies." This is also what distinguishes stemming from lemmatization — lemmatization would correctly return "Study." Examiners specifically test this distinction, so remember: stemming → "Studi" (may be meaningless); lemmatization → "Study" (meaningful root word).

Q21. [1]
Which of the following applications of NLP (Natural Language Processing) is associated with spam filtering in e-mails ?
  1. (A) Virtual Assistants
  2. (B) Sentiment Analysis
  3. (C) Text Classification
  4. (D) Automatic Summarization
Previously asked in: 2024 104 Q4 (v)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(C) Text Classification

Text Classification categorizes documents into predefined groups, which is how spam filters sort emails into "spam" or "not spam" categories.

Explanation

The textbook defines Text Classification as a tool that "classifies a sentence or document category-wise" into predefined groups. Spam filtering works by classifying incoming emails into categories (spam/not spam), making Text Classification the correct answer. Virtual Assistants relate to voice processing, Sentiment Analysis detects emotions/opinions, and Automatic Summarization condenses text — none of these match spam filtering.

Q22. [1]
Which application of NLP helps to provide an overview of a news item or blog post ? It also avoids redundancy from multiple sources and maximises the diversity of content obtained.
  1. (A) Virtual Assistants
  2. (B) Sentiment Analysis
  3. (C) Text Classification
  4. (D) Automatic Summarization
Previously asked in: 2024 104 Q3 (v)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(D) Automatic Summarization

Explanation

Automatic Summarization provides an overview of news items or blog posts, avoids redundancy from multiple sources, and maximises content diversity — all key features that distinguish it from the other options. The source passages list it as a distinct NLP application separate from Text Classification (which categorises documents) and Sentiment Analysis (which detects opinion/emotion).

Q23. [1]
It is a domain-specific language that is designed for managing data held in different kinds of DBMS (Database Management System). It is particularly useful in handling structured data. Which computer language is this ?
  1. (A) SQL
  2. (B) CSV
  3. (C) Spreadsheet
  4. (D) TXT
Previously asked in: 2024 104 Q3 (iv)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(A) SQL

SQL (Structured Query Language) is a domain-specific language designed for managing data in DBMS. It is particularly useful in handling structured data.

Explanation

CSV is a file format, Spreadsheet is an application, and TXT is a plain text file — none are languages for managing databases. SQL is the standard language used to query, insert, update, and delete data in relational databases. This is general computer science knowledge; the source passages focus on NLP, so recall this from your IT/CS fundamentals.

Q24. [1]
A corpus contains 4 documents in which the words such as 'an, is, the' were appearing frequently. Identify the term that is used for such words.
  1. (A) Stop word
  2. (B) Rare word
  3. (C) Missing word
  4. (D) Removable word
Previously asked in: 2024 104 Q2 (vi)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(A) Stop word

Words like 'an', 'is', 'the' that appear frequently in a corpus but add negligible value to the text are called Stop words.

Explanation

As per the textbook (Chapter 6), stop words are defined as words with frequent occurrence in the corpus that have negligible value and are often removed during preprocessing. The MCQ option (A) directly matches this definition. Students must not confuse stop words with "removable words" — the correct technical term is stop word.

Q25. [1]
Spam refers to
  1. (A) Unnecessary images
  2. (B) Temporary files
  3. (C) Junk mails
  4. (D) Music files
Previously asked in: 2024 104 Q1 (ii)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(C) Junk mails

Explanation

Spam refers to unsolicited or junk emails sent in bulk. This is a standard IT literacy definition tested in CBSE. The source passages do not directly define spam, but option (C) is the universally accepted correct answer.

Q26. [4]
Consider the following two documents : Document 1 : ML and DL are part of AI. Document 2 : DL is a subset of ML. Implement all four steps of the Bag of Words (BoW) model to create a document vector table. Depict the outcome of each step.
Previously asked in: 2024 104 Q19
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

Step 1: Text Processing (Pre-processing)
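Convert the text to lowercase, remove punctuation, and remove the stop words (and, are, is, a, of).
Document 1 becomes: ml, dl, part, ai
Document 2 becomes: dl, subset, ml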

Step 2: Create a Dictionary (Vocabulary)
List all unique words from both documents:

| ml | dl | part | ai | subset |
|----|----|------|----|--------|

Step 3: Create Document Vector for Doc 1

| ml | dl | part | ai | subset |
|----|----|------|----|--------|
| 1 | 1 | 1 | 1 | 0 |

Step 4: Create Document Vectors for All Documents

| Document | ml | dl | part | ai | subset |
|----------|----|----|------|----|--------|
| Doc 1 | 1 | 1 | 1 | 1 | 0 |
| Doc 2 | 1 | 1 | 0 | 0 | 1 |

Source: Chapter 6, Section 6.5 – Bag of Words
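
The four steps above can be sketched in Python. This is a minimal illustration, not a library implementation: the stop-word set and the helper names (`normalise`, `build_vocabulary`, `vectorise`) are assumptions chosen to match the vocabulary used in this answer.

```python
# Assumed stop-word list for this example only.
STOP_WORDS = {"and", "are", "is", "a", "of"}

def normalise(text):
    # Step 1: lowercase, drop the full stop, remove stop words.
    words = text.lower().replace(".", "").split()
    return [w for w in words if w not in STOP_WORDS]

def build_vocabulary(docs):
    # Step 2: unique words in order of first appearance.
    vocab = []
    for tokens in docs:
        for w in tokens:
            if w not in vocab:
                vocab.append(w)
    return vocab

def vectorise(tokens, vocab):
    # Steps 3 and 4: count how often each vocabulary word occurs.
    return [tokens.count(w) for w in vocab]

documents = ["ML and DL are part of AI.", "DL is a subset of ML."]
processed = [normalise(d) for d in documents]
vocab = build_vocabulary(processed)
table = [vectorise(tokens, vocab) for tokens in processed]

print(vocab)     # ['ml', 'dl', 'part', 'ai', 'subset']
for row in table:
    print(row)   # [1, 1, 1, 1, 0] then [1, 1, 0, 0, 1]
```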

---

Explanation
Q27. [2]
What are the primary differences between Script-bots and Smart-bots ?
Previously asked in: 2024 104 Q12
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

Script-bot: A scripted or traditional chatbot that follows pre-defined rules and fixed responses. It can only answer questions within its programmed script and cannot handle queries outside it.

Smart-bot: An AI-powered chatbot that uses machine learning and NLP. It has broader knowledge, learns over time, and can handle a wider variety of conversations more naturally.

Source: Chapter 6, Section 6.4 – Chatbots

---

Explanation
Q28. [1]
Which domain of AI is used for interacting with virtual assistants such as Siri and Alexa ?
  1. (a) Machine Learning (ML)
  2. (b) Computer Vision (CV)
  3. (c) Natural Language Processing (NLP)
  4. (d) Technical Vision (TV)
Previously asked in: 2024 104 Q5 (vi)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(c) Natural Language Processing (NLP)

Voice assistants like Siri and Alexa take natural speech, process it, and execute tasks — this is done using Natural Language Processing (NLP).

Explanation

The passage explicitly states: "Hey Alexa, play some music" and "Hey Siri, what's the weather today" as examples of Voice Assistants, listed under Applications of Natural Language Processing. NLP enables machines to understand and respond to human speech. ML is a broader domain; CV deals with images; "Technical Vision (TV)" does not exist as an AI domain.

Q29. [1]
Bag of Words is a ________ model which helps in extracting features out of the text which can be helpful in machine learning algorithms.
  1. (a) Data Science (DS)
  2. (b) Virtual Reality (VR)
  3. (c) Natural Language Processing (NLP)
  4. (d) Computer Vision (CV)
Previously asked in: 2024 104 Q4 (iii)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(c) Natural Language Processing (NLP)

Bag of Words is a Natural Language Processing (NLP) model which helps in extracting features out of the text which can be helpful in machine learning algorithms.

Explanation

The source passage (Chapter 6) lists "Bag of Words" as a key concept under the NLP chapter, and the Test Yourself section (Q6) confirms its purpose is "to extract features from text for machine learning algorithms." Students must remember BoW is an NLP concept, not DS, VR, or CV.

Q30. [1]
Which of the following applications is not associated with Natural Language Processing (NLP) ?
  1. (a) Sentiment Analysis
  2. (b) Speech Recognition
  3. (c) Spam Filtering in emails
  4. (d) Stock Market Analysis
Previously asked in: 2024 104 Q3 (vi)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(d) Stock Market Analysis

Stock Market Analysis is not an NLP application. Sentiment Analysis, Speech Recognition, and Spam Filtering all involve processing natural language text or speech.

Explanation

The textbook (Chapter 6) lists NLP applications as voice assistants, sentiment analysis, text classification, keyword extraction, language translation, and autogenerated captions. Spam filtering uses text classification (an NLP task), and speech recognition processes natural language. Stock Market Analysis is primarily a financial/statistical task, not an NLP application. Examiners expect direct identification of the odd one out with a brief reason.

Q31. [1]
Which of the following words represent an example of a lemma resulting from lemmatisation for "caring" in context to Natural Language Processing (NLP) ?
  1. (a) Care
  2. (b) Cared
  3. (c) Cares
  4. (d) Car
Previously asked in: 2024 104 Q2 (iv)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(a) Care

Lemmatisation reduces a word to its base dictionary form (lemma). The lemma of "caring" is "care", which is a meaningful root word.

Explanation

Lemmatisation always produces a valid dictionary word, unlike stemming which may produce incomplete forms. "Car" is unrelated; "cared" and "cares" are inflected forms, not the base lemma. Examiners expect students to know that lemmatisation = meaningful root word (lemma).

Q32. [1]
This real life application of NLP is used to provide an overview of a news item or blog post, while avoiding redundancy from multiple sources and maximising the diversity of content obtained. Which is this application ?
  1. (a) Chatbot
  2. (b) Virtual Assistant
  3. (c) Sentiment Analysis
  4. (d) Automatic Summarisation
Previously asked in: 2024 104 Q2 (ii)
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

(d) Automatic Summarisation

This application provides an overview of a news item or blog post, avoids redundancy from multiple sources, and maximises diversity of content.

Explanation

The question describes the key features of Automatic Summarisation — summarising content, removing redundancy, and maximising diversity. Chatbots simulate conversation, Virtual Assistants execute voice tasks, and Sentiment Analysis detects opinions — none match this description. Choose the option that best fits all three conditions given in the question.

Q33. [4]
Create a document vector table from the following documents by implementing all the four steps of Bag of words model. Also depict the outcome of each step. Document 1: Sameera and Sanya are classmates. Document 2: Sameera likes dancing but Sanya loves to study mathematics.
Previously asked in: 2023 104 Q19
Generated by claude-sonnet-4-6 · 2026-06-21 03:19 · grounding rag
Model Answer

Step 1: Text Processing (Pre-processing)
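Convert the text to lowercase, remove punctuation, and remove the stop words (and, are, but, to).
Document 1 becomes: sameera, sanya, classmates
Document 2 becomes: sameera, likes, dancing, sanya, loves, study, mathematics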

Step 2: Create Dictionary (Vocabulary)
List all unique words from both documents:

| sameera | sanya | classmates | likes | dancing | loves | study | mathematics |
|---------|-------|------------|-------|---------|-------|-------|-------------|

Step 3: Create Document Vector for Doc 1

| sameera | sanya | classmates | likes | dancing | loves | study | mathematics |
|---------|-------|------------|-------|---------|-------|-------|-------------|
| 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |

Step 4: Document Vector Table for All Documents

| Document | sameera | sanya | classmates | likes | dancing | loves | study | mathematics |
|----------|---------|-------|------------|-------|---------|-------|-------|-------------|
| Doc 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| Doc 2 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |

Source: Chapter 6, Section 6.5 – Bag of Words

---

Explanation
Q34. [2]
With reference to data processing, expand the term TFIDF. Also give any two applications of TFIDF.
Previously asked in: 2023 104 Q16
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

TFIDF stands for Term Frequency-Inverse Document Frequency.

Two applications of TFIDF:

  1. Keyword Extraction – It helps identify the most important/valuable words in a document by assigning higher scores to rare but significant terms.
  2. Text Classification – It is used to classify documents into categories by determining the relevance of words within them.
Explanation
Q35. [2]
Define Chatbot. What are its types?
Previously asked in: 2023 104 Q13
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

A chatbot is a computer program designed to simulate human conversation through voice commands, text chats, or both. It can learn over time how to best interact with humans.

Types of Chatbots:

  1. Script-bot – A traditional, scripted chatbot that follows predefined rules.
  2. Smart-bot – An AI-powered chatbot with greater knowledge and learning ability.

Source: Chapter 6, Section 6.4 – Chatbots

---

Explanation
Q36. [1]
Smart Assistants such as Alexa, Siri are the examples of:
  1. (a) Natural Language Processing
  2. (b) Data Science
  3. (c) Machine Learning
  4. (d) Computer Vision
Previously asked in: 2023 104 Q5 (iii)
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

(a) Natural Language Processing

Alexa and Siri are voice assistants that leverage NLP to understand natural speech and execute tasks efficiently.

Source: Chapter 6, Section 6.2 – Applications of Natural Language Processing

---

Explanation

The passage explicitly states: "Hey Alexa, play some music" and "Hey Siri, what's the weather today" as examples under Voice Assistants, which is listed as an application of Natural Language Processing. In MCQs, always look for the exact term used in the textbook against the given examples.

Q37. [1]
______ is a term used for any word or number or special character occurring in a sentence. (Token / Punctuator)
Previously asked in: 2023 104 Q5 (i)
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

Token is a term used for any word or number or special character occurring in a sentence.

Source: Chapter 6, Section 6.3 / Text Normalisation – Tokenisation

Explanation

The examiner expects the single correct term "Token." In tokenisation (a step of Text Normalisation in NLP), the text is broken into smaller units called tokens — these can be words, numbers, or special characters. Do not write "Punctuator," as that refers only to punctuation marks, not all types of units in a sentence.

Q38. [1]
Which of the following is a feature of document classification?
  1. (a) Helps in classifying the type and genre of a document.
  2. (b) Helps in creating a document.
  3. (c) Helps to display important information of a corpus.
  4. (d) Helps in including the necessary words in the text body.
Previously asked in: 2023 104 Q4 (i)
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

(a) Helps in classifying the type and genre of a document.

Text Classification (document classification) classifies a sentence or document category-wise into predefined groups or categories.

Explanation

The source passage under Applications of NLP describes Text Classification as a tool that "classifies a sentence or document category-wise" into predefined groups — i.e., it classifies the type and genre of a document. The other options describe unrelated NLP tasks (corpus display, document creation, word inclusion), so (a) is the only correct answer.

Q39. [1]
With reference to NLP, consider the following plot of occurrence of words versus their value: In the given graph, X represents:
  1. (a) Rare / valuable words
  2. (b) Punctuation words
  3. (c) Popular words
  4. (d) Pronoun
Previously asked in: 2023 104 Q3 (vi)
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

(a) Rare / valuable words

X represents rare / valuable words — words that occur the least in the corpus but add the most value to the text analysis.

Explanation

The graph follows Zipf's Law distribution in NLP. Stop words (high occurrence, low value) appear at the top-left; frequent words are in the middle; and at the bottom-right (lowest occurrence), point X marks rare/valuable words — they appear least but carry the most meaning. The textbook (Test Yourself Q8) confirms: "They occur the least but add the most value to the corpus." Examiners expect students to read the graph direction correctly and link low occurrence with high value.

Q40. [1]
For ______ the whole corpus is divided into sentences. Each sentence is taken as a different data so now the whole corpus gets reduced to sentences.
  1. (a) Text Regulation
  2. (b) Sentence Segmentation
  3. (c) Tokenisation
  4. (d) Stemming
Previously asked in: 2023 104 Q3 (iii)
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

(b) Sentence Segmentation

Under Sentence Segmentation, the whole corpus is divided into sentences, and each sentence is taken as different data, reducing the corpus to sentences.

Source: Text Processing, Section 6.5

---

Explanation

The passage in Section 6.5 directly defines Sentence Segmentation as the step where "the whole corpus is divided into sentences" — the question uses this exact language. Tokenisation comes after segmentation (it breaks sentences into individual tokens). Students must not confuse the two steps.
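
As a rough sketch, sentence segmentation can be imitated with a regular expression that splits after a full stop, question mark or exclamation mark. The function name `segment_sentences` and the sample corpus are assumptions for illustration; this simple rule would wrongly split abbreviations such as "Dr.".

```python
import re

def segment_sentences(corpus):
    # Split wherever ., ! or ? is followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", corpus.strip())
    return [p for p in parts if p]

corpus = "ML and DL are part of AI. DL is a subset of ML."
sentences = segment_sentences(corpus)

for s in sentences:
    print(s)
# ML and DL are part of AI.
# DL is a subset of ML.
```

Each sentence in the resulting list can then be passed to tokenisation as a separate piece of data.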

Q41. [1]
Select the correct features of Smart Bot:
  1. (a) Smart-bots are flexible and powerful
  2. (b) Coding is required to take this up on board
  3. (c) Smart bots work on bigger databases and other resources directly
  4. (d) All of the above
Previously asked in: 2023 104 Q3 (ii)
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

(d) All of the above

All three statements are features of Smart-bots: they are flexible and powerful, coding is required to take them on board, and they work on bigger databases and other resources directly. Since (a), (b) and (c) are all correct, the answer is (d).

---

Explanation

There are two types of chatbots: Script-bots and Smart-bots. Script-bots are easy to make, work around a script programmed into them and have limited functionality. Smart-bots are AI-powered, flexible and powerful, work on bigger databases and other resources directly, learn with more data, and require coding to take them on board. Because every statement in (a), (b) and (c) describes a Smart-bot, "All of the above" is correct. In such MCQs, check each statement separately before choosing "All of the above".

Q42. [1]
Email filters, spam filters, smart assistants are the examples of:
  1. (a) Pocket Assistants
  2. (b) CV
  3. (c) NLP
  4. (d) Evaluation
Previously asked in: 2023 104 Q3 (i)
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

(c) NLP

Email filters, spam filters, and smart assistants are real-world applications of Natural Language Processing (NLP).

Explanation

The source passages (Chapter 6, Section 6.2) list voice assistants (smart assistants like Siri, Alexa, Google) as key NLP applications. Email/spam filters also use NLP to classify and process text. Examiners expect you to directly identify the correct option and briefly justify it in one line.

Q43. [1]
Two popular examples of pocket assistants are _____ and _____.
Previously asked in: 2023 104 Q2 (i)
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

Two popular examples of pocket assistants (voice assistants) are Google Assistant (Hey Google) and Siri (Hey Siri).

Source: Chapter 6, Section 6.2 – Applications of Natural Language Processing

---

Explanation

The source passage lists voice assistants such as Google, Alexa, and Siri as examples. "Pocket assistants" refers to voice/virtual assistants on personal devices. Acceptable answers include any two from: Google Assistant, Siri, or Alexa. Examiners expect recognisable names — just naming two correctly earns the mark.

Q44. [4]
With reference to NLP, explain the following terms in detail with the help of suitable example: • Term frequency • Inverse Document Frequency
Previously asked in: 2022 104 Q20
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

Term Frequency (TF):
Term Frequency measures how often a word appears in a document. It helps identify how important a word is within a single document.

Formula: TF = (Number of times word appears in document) / (Total words in document)

Example: In Document: "Johny Johny Yes Papa" — TF of "Johny" = 2/4 = 0.5

---

Inverse Document Frequency (IDF):
IDF measures how rare or valuable a word is across all documents in the corpus. Common words (like "the") get a low IDF; rare words get a high IDF.

Formula: IDF = log(Total documents / Documents containing the word), where the logarithm is taken to base 10

Example: If "Papa" appears in 3 out of 4 documents, IDF of "Papa" = log(4/3) ≈ 0.125

TFIDF = TF × IDF — words that are frequent in one document but rare across the corpus are most valuable.

Source: Chapter 6, Section 6 (TFIDF concept)
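
The two worked examples can be checked with a short Python sketch. It assumes the normalised TF formula and a base-10 logarithm, which is what gives 0.125 for log(4/3); the function names are illustrative only.

```python
import math

def term_frequency(word, document):
    # TF = times the word appears / total words in the document.
    words = document.lower().split()
    return words.count(word.lower()) / len(words)

def inverse_document_frequency(total_docs, docs_with_word):
    # IDF = log10(total documents / documents containing the word).
    return math.log10(total_docs / docs_with_word)

tf = term_frequency("Johny", "Johny Johny Yes Papa")
idf = inverse_document_frequency(4, 3)

print(tf)             # 0.5
print(round(idf, 3))  # 0.125
```

Multiplying a word's TF by its IDF gives its TFIDF score. A word that appears in every document gets IDF = log(1) = 0, so its TFIDF is 0 however often it occurs.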

---

Explanation
Q45. [4]
Consider the text of following documents: Document 1: Sahil likes to play cricket Document 2: Sajal likes cricket too Document 3: Sajal also likes to play basketball Apply all the four steps of Bag of words model of NLP on the above given documents and generate the output.
Previously asked in: 2022 104 Q19
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

Step 1: Text Normalisation
Convert all three documents to lowercase. The unique words at this point are {sahil, likes, to, play, cricket, sajal, too, also, basketball} (9 unique words). The stop words "to", "too" and "also" are then removed.

Step 2: Create a Dictionary (Vocabulary)
{sahil, likes, play, cricket, sajal, basketball}

Step 3: Create Document Vector for Doc 1
Doc 1 = [1, 1, 1, 1, 0, 0]

Step 4: Create Document Vectors for All Documents

| Word | Doc1 | Doc2 | Doc3 |
|----------|------|------|------|
| sahil | 1 | 0 | 0 |
| likes | 1 | 1 | 1 |
| play | 1 | 0 | 1 |
| cricket | 1 | 1 | 0 |
| sajal | 0 | 1 | 1 |
| basketball | 0 | 0 | 1 |

Each document is now represented as a numerical vector based on word frequency.

Source: Chapter 6, Bag of Words Model

---

Explanation
Q46. [2]
Explain the following picture which depicts one of the processes on NLP. Also mention the purpose which will be achieved by this process.
Previously asked in: 2022 104 Q18
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

The diagram depicts Converting Text to a Common Case, a step in Text Normalisation (Text Processing) in NLP.

In this process, all words — regardless of their capitalisation (e.g., HELLO, HeLlo, HELLo) — are converted to lowercase (hello).

Purpose: This ensures uniformity in word representation, so that the machine does not treat the same word written in different cases as different words, thereby improving accuracy of text analysis.

Source: Chapter 6, Section 6.5 – Text Processing (Text Normalisation)
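
A small Python sketch shows the effect. The word list reuses the spellings from the answer above; the variable names are illustrative.

```python
words = ["HELLO", "HeLlo", "HELLo"]

# Before conversion, the machine sees three different strings.
distinct_before = set(words)

# After converting to a common (lower) case, only one word remains.
distinct_after = {w.lower() for w in words}

print(len(distinct_before))  # 3
print(distinct_after)        # {'hello'}
```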

---

Explanation
Q47. [2]
Kaira, a beginner in the field of NLP is trying to understand the process of Stemming. Help her in filling up the following table by suggesting appropriate affixes and stem of the words mentioned there:
Previously asked in: 2022 104 Q16
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

| S. No. | Word | Affixes | Stem |
|--------|------|---------|------|
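| i. | Tries | -es | Tri |
| ii. | Learning | -ing | Learn |

Stemming removes affixes (prefixes/suffixes) from a word to obtain its stem. The stem need not be a meaningful word: "Tri" is not a dictionary word, whereas lemmatization would return "Try".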

Source: Chapter 6, Text Processing / NLP Stages

---

Explanation
Q48. [2]
What is Tokenization? Count how many tokens are present in the following statement: I find that the harder I work, the more luck I seem to have.
Previously asked in: 2022 104 Q15
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

Tokenization is a step in Text Normalisation where each sentence is divided into smaller units called tokens. A token can be any word, number, or special character occurring in a sentence.

Counting tokens in: I find that the harder I work, the more luck I seem to have.

Tokens: I / find / that / the / harder / I / work / , / the / more / luck / I / seem / to / have / .

Total number of tokens = 16

Source: Text Normalisation – Tokenization, Chapter 6
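
The count can be verified with a simple regular-expression tokeniser. This is a minimal sketch that treats each run of letters or digits as one token and each punctuation mark as its own token; the function name `tokenise` is an assumption, and real tokenisers handle more cases (for example contractions).

```python
import re

def tokenise(sentence):
    # \w+ matches a word or number; [^\w\s] matches one special character.
    return re.findall(r"\w+|[^\w\s]", sentence)

statement = "I find that the harder I work, the more luck I seem to have."
tokens = tokenise(statement)

print(tokens)
print(len(tokens))  # 16
```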

---

Explanation
Q49. [2]
Differentiate between Script-bot and Smart-bot.
Previously asked in: 2025 104/S Q12; 2022 104 Q13
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

| Basis | Script-bot | Smart-bot |
|-------|-----------|-----------|
| Nature | Traditional/scripted chatbot | AI-powered chatbot |
| Knowledge | Follows pre-written scripts; limited responses | Has broader knowledge; learns and adapts over time |
| Interaction | Feels robotic; cannot handle unknown queries | Feels more human-like; handles varied conversations |

Script-bots respond only within fixed, programmed rules, while Smart-bots use Artificial Intelligence to understand context and give more flexible, intelligent responses.

Source: Chapter 6, Section 6.4 – Chatbots
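
The idea of a Script-bot can be sketched in Python as a fixed lookup table. The questions, replies and names below (`SCRIPT`, `FALLBACK`, `script_bot_reply`) are hypothetical and only show why such a bot cannot handle queries outside its script.

```python
# Hypothetical script: every supported question has one fixed reply.
SCRIPT = {
    "hello": "Hi! How can I help you?",
    "what are your timings": "We are open from 9 am to 5 pm.",
}
FALLBACK = "Sorry, I do not understand that."

def script_bot_reply(message):
    # Normalise the message, then look it up in the script.
    key = message.lower().strip(" ?!.")
    return SCRIPT.get(key, FALLBACK)

print(script_bot_reply("What are your timings?"))
# We are open from 9 am to 5 pm.
print(script_bot_reply("Can you suggest a good book?"))
# Sorry, I do not understand that.
```

A Smart-bot replaces the fixed table with a trained language model, which is why it can respond to questions that were never written into a script.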

---

Explanation
Q50. [1]
Name the process of dividing whole corpus into sentences.
Previously asked in: 2022 104 Q10
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

The process of dividing the whole corpus into sentences is called Sentence Segmentation.

Source: Text Processing, Section 6.5, Chapter 6

Explanation

Sentence Segmentation is the step of Text Normalisation in which the whole corpus is divided into sentences, and each sentence is then treated as separate data. Tokenisation comes after it and breaks each sentence into tokens. Examiners expect the exact term "Sentence Segmentation" for full marks.

Q51. [1]
Name any two currently popular virtual assistants.
Previously asked in: 2022 104 Q9
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

Two currently popular virtual assistants are Google Assistant (Hey Google) and Siri (Apple). Alexa (Amazon) is also a widely used voice assistant.

Source: Applications of Natural Language Processing, chapter 6, section 6.2

Explanation

The source passage lists three voice assistants as examples under "Voice assistants" in section 6.2 — Google Assistant, Alexa, and Siri. For a 1-mark question, naming any two is sufficient. Avoid over-explaining; just name them clearly.

Q52. [1]
Mention any two commonly used applications of NLP.
Previously asked in: 2022 104 Q8
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

Two commonly used applications of NLP are: (i) Voice Assistants (e.g., Google Assistant, Siri) that understand and process natural speech, and (ii) Language Translation (e.g., Google Translate) that converts text from one language to another.

Source: Applications of Natural Language Processing, chapter 6, section 6.2

---

Explanation

For a 1-mark question, name any two applications clearly — one line is enough. The source lists six applications: Voice Assistants, Autogenerated Captions, Language Translation, Sentiment Analysis, Text Classification, and Keyword Extraction. Pick any two and briefly state what they do. Avoid writing long descriptions; the examiner awards the mark for correct identification.

Q53. [1]
What is NLP?
Previously asked in: 2022 104 Q7
Generated by claude-sonnet-4-6 · 2026-06-21 03:18 · grounding rag
Model Answer

NLP (Natural Language Processing) is a branch of Artificial Intelligence that enables computers to understand, interpret, and process human (natural) language.

Source: Chapter 6, Unit Overview

Explanation

For a 1-mark question, one clear definitional sentence is sufficient. Examiners look for two key ideas: (1) it is an AI technique, and (2) it deals with human/natural language understanding and processing. Avoid listing applications or stages — that exceeds the mark allocation.
