Notes On and From Natural Language Processing


This post is a continuation of my thoughts on the NLP course from Grad Studies Fall 2020 Summary. Earlier posts related to NLP on this blog: Natural Language Processing Paper Discussion: Novel Chapters Summarization and Natural Language Processing PCPR Pun Paper Presentation.

Computational Linguistics courses like Applied Information Technology 590: Introduction to Natural Language Processing would have really benefited from the process of self-inquiry, discovery, cognitive psychology studies, and having us consider the wonders of the human mind itself first. Abstract concepts like Mathematics are harder to teach, even harder to grasp, and require a much more tenacious methodology to hold on to. But learning them with reference to real-world practical applications is an art form in itself. I wish for more in-depth 3Blue1Brown or Crash Course kind of content when it comes to mathematical concepts, and for the related statistics and machine learning concepts too. If I were to redesign the course myself, however, I would start with the current discussions in NLP most relevant to the practices of today: the types of historical linguistics studies using language technology, the ethical considerations of existing practices like inclusive or server spaces, the philosophically insidious nature of smart home devices, etc. Master’s degrees are meant to be more industry-focused than research-focused PhD studies, yes, and the direct applicability of a humanities way of thinking is extremely pertinent to those of us going on to work in that capacity.

Then I would incorporate step-by-step guidance through multi-tiered assignments. I would not have minded the sheer difficulty of the coursework if I could have truly understood and worked it out myself – but then this was a course structured around teamwork / group effort. When the questions are unstructured, especially in the online school format, they might as well be unintelligible. I would give out multiple smaller assignments with increasing complexity and actual directions instead of leaving students in the wild, because that ultimately only taught me how to Google / search for answers online. If I were to do the coding myself, I genuinely believe I could, with some direction and a set of requirements actually hashed out. The multi-pronged, step-by-step approach of leveling up could culminate in checkpoint mini projects instead of the entire project being thrown at us to tackle. So I tried. I tried to work on what I wanted to achieve from the course myself, which was ultimately one of the added stressors but quite fruitful.

In my networks of thoughts, Jorge Luis Borges’ work reminds me of the Digital Humanities concept of “Distant Reading” and the Natural Language Processing concept of “N-gram analysis”.

NLP specifically would have benefited from an artsier mode of instruction. Jurafsky’s textbook was a delight for this very reason. Not only does it incorporate Broadway references and Noam Chomsky’s philosophy studies, it also connects the theoretical to the practical usage of NLP theories in daily life and beyond.

Nothing exists in binary, definitely not the tech being consumed for egregious purposes daily, and the future practitioners of such tech should be fundamentally taught that. It is, I would argue, more important than the actual learning curve of applying Python packages, in a week’s time frame, to algorithms developed by IBM and Microsoft professionals over months. That pace was ultimately detrimental, spent in the effort to submit assignments within their deadlines and to keep resubmitting them to raise the grade. I did receive an A+, however uncertain I was of such a result given the towering-beast-to-be-tamed nature of the course.

Even after multiple requests for clarification, there was only so much that could be communicated via emails and TA office hours. Of course this could have been easier for a group endeavor, but yet another pitfall of virtual classes is the lack of true camaraderie of group members, which eventually led to me switching out teams midway through the semester in sheer frustration. The course outcome for me was what could be achieved by going through multiple StackOverflow, TowardsDataScience, and KDNuggets articles. In effect, I wish for more and at the same time, I was absolutely knackered out, and felt the rightful deserving grace of the eventual A+ grade.

This is a general critique I have for the education system on the whole. Instead of starting afresh linearly from the historical origins to the present, what could the benefit of going backwards from the here and now to the relevant canon be? For one, it would entirely side-step the need to listen to people whose views no longer hold water. It would restructure the entire question of knowledge requirements around immediate applicability in both industry and research work. Computer Science courses are truly lacking in their learning resources and methodology, even if they are employing some education technology (which is often not well-meaning – the ethics of proctoring, for example). They are geared towards a specific way of thinking which is hardly inclusive. One of the biggest lessons from the representation of women in STEM on social media is knowing that imposter syndrome is heartbreakingly common amongst many of us, and the under-confidence is off the charts. As a lifelong sufferer of the sciences, I came to realize it was not that I was not smart – I would simply benefit from a visual, auditory, possibly neurodivergent form of learning. I had to realize the resources into my existence and mesh them into my very being.

This was, nevertheless, an incredibly important course for me, if not for realizing this thought process, then for being a part of Dr. Anamaria Berea’s NLP seminar and working on the Linear A Knowledge Mining and Natural Language Processing working system and hopeful journal publication. Besides, there was also the wondrous opportunity to interview as a PhD student in the AI:CULT project.

IBM Watson on Jeopardy!

Watson is a question-answering computer system capable of answering questions posed in natural language, developed as a part of IBM’s DeepQA project. Watson was named after IBM’s founder and first CEO, industrialist Thomas J. Watson.

David Ferrucci, the Founder, CEO, and Chief Scientist of Elemental Cognition, a venture exploring a new field of study called natural learning, is a computer science researcher specializing in knowledge representation and reasoning. He led the team of scientists and academicians in the IBM Watson effort.

The documentary describes the IBM DeepQA team experiencing the nail-biting process of building a human-like intelligent system that can compete in a popular game show.  

1. The idea of IBM Watson was born out of science fiction and speculative fantasies like Fritz Lang’s “Metropolis”, Isaac Asimov’s “I, Robot” series, the “Star Trek” TV series, as well as Stanley Kubrick and Arthur C. Clarke’s “2001: A Space Odyssey”. While the idea of a sentient machine mimicking human speech patterns was the more immediate goal, the human-like contextual, referential, expansive imitation of a human’s way of thinking was still a distant possibility for a computer system to achieve – until IBM Watson.

2. The system “learned” the Jeopardy quizzes from the past 40 years and when faced with a Jeopardy challenge in a particular topic, looks up keywords in this database repository of knowledge.

3. IBM Watson was able to answer the Jeopardy quizzes in their specific Answer-with-a-Question format but could not grasp the nuances of competing against other humans. For example, since the quiz format was buzzer oriented, the machine would sometimes buzz in and give the same reply that another contestant had just given in their turn and gotten wrong.

4. Moreover, the glitches in the answers that were being pulled from its database of reasoning and knowledge were multiple – keyword pronunciation for one, semantic irregularities (puns, double meanings, spatio-temporal cultural references, etc.) for another.

5. As an information seeking and retrieval tool, Watson’s unique capabilities can be applied to a wide range of problem solving tasks, including medical diagnosis by going through the big data of medical records.

Once the producers of the game show Jeopardy! had reviewed Watson’s performance, they were invited to consider how well Watson’s natural language understanding coped with the deeper or trickier spatio-temporal and socio-cultural categories of the game. It was recognized that while Watson would find it easier to outperform humans in factual categories that can be looked up like encyclopedic knowledge, it still had to work out the kinds of categories whose clues humans have no trouble picking up. However, the IBM DeepQA team could program an effective reinforcement learning component through which Watson corrects itself in such cases.

The computational themes include Knowledge Representation, Mining, Q-A systems, Keyword Lookup, and Human-Computer Interaction.

For my undergraduate final project I worked on Microsoft’s Language Understanding Intelligent Service (LUIS) entity and intent recognition to train and test a chat bot for event management. During the literature survey for that project I came across Watson and implemented a similar Q-A model on a Node.js and MongoDB server.

Further Reading

How IBM Watson Overpromised and Under Delivered on AI HealthCare

Long before Watson starred on the Jeopardy! stage, IBM had considered its possibilities for health care. In 2015, IBM announced the formation of a special Watson Health division, and by mid-2016 Watson Health had acquired four health-data companies for a total cost of about $4 billion. Many hospitals proudly use the IBM Watson brand in their marketing, telling patients that they’ll be getting AI-powered cancer care. At MD Anderson, researchers put Watson to work on leukemia patients’ health records, and quickly discovered how tough those records were to work with.

IBM’s Watson Is Everywhere—But What Is it?

Since winning Jeopardy! in 2011, IBM’s Watson has apparently found employment as a dress designer, a chef, and a movie director.

The piece talks about how IBM is trying to combine the AI capabilities that fall under the Watson brand with the work of conventional consultants. Even if you follow developments in AI closely it can be hard to keep track of all the things Watson can do. In truth, very little of the technology used to win Jeopardy! remains in Watson. In most cases, the roles Watson is supposedly taking on involve applying some version of machine learning in a novel area.

Getting Started with AI using IBM Watson Coursera Massive Open Online Course

As a quick introduction to Artificial Intelligence via the concepts in IBM Watson, this course by Rav Ahuja, an AI and Data Science Program Director at IBM, discusses real-life client case studies and Watson AI applications in smart applications via a hands-on approach.

Neural Models for Information Retrieval

Bhaskar Mitra is a Principal Applied Scientist at Microsoft AI & Research, Cambridge. He started working at Bing (previously known as Live Search) in 2007, working on several problems related to document ranking, query formulation, entity ranking, and evaluation. He was also a part of Microsoft labs in Hyderabad (India), Bellevue (USA), and Cambridge (UK). His current research interests as a doctoral student at University College London include representation learning and neural networks, and their applications to information retrieval. He co-organized multiple workshops (at SIGIR 2016 and 2017) and tutorials (at WSDM 2017 and SIGIR 2017) on neural IR, and served as a guest editor for the special issue of the Information Retrieval Journal.

The talk was about doing machine learning differently, motivated by specific challenges in IR. It focuses on learning good vector representations of text for information retrieval.

1. Neural ranking models for information retrieval (IR) use neural networks to rank search results in response to a query for auto-completion and prediction. 

2. The talk introduces Information Retrieval (IR) and different neural and non-neural approaches for learning feature space similarity and vector representations of text in the query. 

3. This talk reviews shallow neural networks that use pre-trained neural term embeddings, along with their architectures, as well as deep neural networks.

4. For the retrieval of long and short text of variable length documents, query matching is relevant. 

5.  In vector representation, notions of similarity depend on what feature space you choose. 

One of the most well-known Information Retrieval tasks is web search, where both query auto-completion and prediction occur over many long text documents. In this framework, the input text and each candidate text are mapped to feature vectors, and relevance is then estimated from the similarity between those vectors. The vector space representations are either local or distributed, and in both cases the similarity of the vectors can be determined. The desired notion of similarity should depend on the target task, that is, on which relationships between the texts are being modeled.
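
To make the point about feature spaces concrete for myself, here is a minimal Python sketch. It is not taken from the talk: the three words, the context-word counts in CONTEXT, and the helper names are all my own illustrative assumptions. It compares the same word pairs in two different feature spaces (character trigrams versus made-up neighbouring-word counts) using cosine similarity.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def char_trigrams(word):
    """Feature space 1: character trigrams of the padded word."""
    padded = f"#{word}#"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


# Feature space 2: hypothetical counts of neighbouring words.
# These numbers are invented for illustration, not measured from a corpus.
CONTEXT = {
    "banana": Counter({"fruit": 3, "yellow": 2, "eat": 2}),
    "mango": Counter({"fruit": 3, "sweet": 2, "eat": 1}),
    "bandana": Counter({"wear": 3, "head": 2, "cloth": 2}),
}


def compare(w1, w2):
    """Similarity of the same word pair under both feature spaces."""
    return {
        "trigram": cosine(char_trigrams(w1), char_trigrams(w2)),
        "context": cosine(CONTEXT[w1], CONTEXT[w2]),
    }


if __name__ == "__main__":
    print("banana vs bandana:", compare("banana", "bandana"))
    print("banana vs mango:  ", compare("banana", "mango"))
```

In the trigram space "banana" sits closer to "bandana" because they share spelling, while in the context space it sits closer to "mango" because they share neighbours. Which of the two is the "right" similarity depends on the task.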

The Deep Semantic Similarity Model (DSSM) trains on query and document pairs for document ranking in web search. The Dual Embedding Space Model (DESM) keeps both the input and output layer matrices of word2vec to consider word associations. Queries could also be handled with topic-specific term embeddings, so that domain-specific data ranks highly alongside a global corpus.
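
The dual embedding idea can be sketched roughly as follows. This is my own toy illustration, not code from the talk: the two-dimensional vectors in IN_EMB and OUT_EMB are made-up stand-ins for the input and output matrices of a trained word2vec model, and desm_score is a hypothetical helper name. As I understand the model, each query word is looked up in the input space, the document is summarized as the centroid of its normalized output-space word vectors, and the score is the average cosine similarity between the two.

```python
import math

# Hypothetical toy embeddings; a real system would load both matrices
# from a trained word2vec model instead of hard-coding them.
IN_EMB = {
    "cambridge": [0.9, 0.1],
}
OUT_EMB = {
    "university": [0.8, 0.2],
    "college": [0.9, 0.3],
    "giraffe": [0.1, 0.9],
    "zoo": [0.2, 0.8],
}


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def desm_score(query_terms, doc_terms):
    """Average cosine between each query IN vector and the document OUT centroid."""
    doc_vecs = [normalize(OUT_EMB[t]) for t in doc_terms if t in OUT_EMB]
    query_vecs = [IN_EMB[t] for t in query_terms if t in IN_EMB]
    if not doc_vecs or not query_vecs:
        return 0.0  # nothing to compare, e.g. all words out of vocabulary
    doc_centroid = centroid(doc_vecs)
    return sum(cosine(q, doc_centroid) for q in query_vecs) / len(query_vecs)


if __name__ == "__main__":
    print(desm_score(["cambridge"], ["university", "college"]))
    print(desm_score(["cambridge"], ["giraffe", "zoo"]))
```

With these invented vectors the document about universities scores higher for the query "cambridge" than the one about giraffes, even though neither document contains the query word itself.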

Themes considered are various basic and advanced methodologies in neural networks for web search engines. The training datasets might be small, but the short and long text documents both appear as search results with various embedding and document contextual terminologies.  

There are several challenges with text retrieval which means managing several other issues first. 

This was a talk on fairly advanced concepts that I am yet to encounter in the classroom setting. I believe concepts in my other course for this semester, Knowledge Mining, will eventually incorporate Data Mining with Neural Networks, and I hope to apply information retrieval concepts there. In the Scientific Databases course, I have come across similarity matching by Convolutional Neural Networks for Bioinformatics (protein similarity analysis).

Further Reading

Tutorialspoint Information Retrieval

The main goal of IR research is to develop a model for retrieving information from repositories of documents. The IR system then returns the documents related to the desired information. For example, a query with the terms “Social” and “Economic” will produce the set of documents that are indexed with both terms. The primary goal of any information retrieval system must be accuracy, that is, producing relevant documents as per the user’s requirement.
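
The “Social” and “Economic” example is a Boolean AND query, which can be sketched with a small inverted index. This is a minimal illustration under my own assumptions: the three documents and the function names build_index and boolean_and are made up, and tokenization is just lowercasing and splitting on whitespace.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Return the ids of documents indexed with every one of the terms."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical mini-collection, invented for this example.
DOCS = {
    "d1": "social and economic policy",
    "d2": "economic growth report",
    "d3": "social media trends",
}

if __name__ == "__main__":
    index = build_index(DOCS)
    print(boolean_and(index, "Social", "Economic"))
```

Only the document that contains both terms comes back; documents with just one of the two terms are left out, which is exactly why pure Boolean retrieval can feel too strict in practice.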

NLP Meets the Jabberwocky

Author: Susan Feldman 

Any automatic system has its stumbling blocks, and information retrieval systems are no exception, particularly since they must deal with the vagaries of human language. Current translating systems often rely on dictionaries and word-by-word exchanges, resulting in translations such as “How does Ca go?” from the original French “Comment ça va?” Retrieving ideas instead of words seems a more promising approach, and some of these systems are almost ready for the market.

Natural Language Processing in Textual Information Retrieval and Related Topics

Authors: Mari Vallez; Rafael Pedraza-Jimenez 

Description: This article describes the key methodologies of NLP applied in information retrieval, along with several fields of research related to information retrieval and natural language processing specifically. Statistical processing methods of Information Retrieval, like the “bag-of-words” model and the document processing model, try to match the documents’ words with those of the queries. Textual Information Retrieval and NLP techniques are often used both for facilitating descriptions of document content and for presenting the user’s query, all with the aim of comparing both descriptions and presenting the user with the documents that best satisfy their information needs.
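
A bag-of-words match between a query and documents can be sketched in a few lines. This is my own minimal illustration, not the method from the article: the scoring rule (counting shared word occurrences), the sample documents, and the names bag_of_words, overlap_score and rank are assumptions made for the example.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens, ignoring word order."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query_bag, doc_bag):
    """Number of query word occurrences that are also present in the document."""
    return sum(min(count, doc_bag[term]) for term, count in query_bag.items())


def rank(query, docs):
    """Return ids of matching documents, best overlap first (ties by id)."""
    query_bag = bag_of_words(query)
    scored = [
        (overlap_score(query_bag, bag_of_words(text)), doc_id)
        for doc_id, text in docs.items()
    ]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ordered if score > 0]


# Hypothetical documents, invented for this example.
DOCS = {
    "a": "Neural ranking models for web search",
    "b": "Cooking pasta at home",
    "c": "Web search with neural networks and ranking",
}

if __name__ == "__main__":
    print(rank("neural ranking models", DOCS))
```

Because word order and meaning are thrown away, this kind of matching only rewards literal word overlap, which is the limitation the neural approaches from the talk try to address.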


Mari Vallez; Rafael Pedraza-Jimenez. Natural Language Processing in Textual Information Retrieval and Related Topics [online]. “”, num. 5, 2007.

Susan Feldman. Natural Language Processing in Information Retrieval. p. 13. May 1999.

NLP – Information Retrieval – Tutorialspoint.

Sophia AI

In this interview, Dr. Hanson, creator of Sophia, answers several questions about Sophia, a humanoid robot. He explains how Sophia is one of the latest developments in the field of Artificial Intelligence, capable of learning and gaining experience from human interactions. He describes her as a progressive individual who is able to talk about rights. The interview also reveals how different teams work on her in different areas, which makes you think about the vastness and depth of the project.

Dr. Hanson also addresses the interviewers’ concern about the possibility of robots taking over, as she is programmed to get smarter day by day through more and more interactions. He clarifies that the idea behind creating her was to enable machines to learn like babies do, and that her AI is designed with human values, kindness, and wisdom at its core. He envisions Sophia and other robots helping to solve complicated problems in a smarter way.

I have come across the concept discussed in the video in many science fiction novels and movies, the most recent being Ex Machina and Blade Runner (based on Do Androids Dream of Electric Sheep?). The human consensus is to be afraid of the unknown; however, in Computer Science, we learn that programs that operate machines like Sophia are a product of their creators. For example, in my undergraduate final project, I worked on training a chatbot to reply to user utterances with interesting pop-culture references in its responses, which makes it seem more like a younger generation internet user, much like how Sophia communicates with the show hosts.

Further Reading

The first-ever robot citizen has 7 humanoid ‘siblings’ — here’s what they look like

Author: Chris Weller 

This article introduces us to 7 humanoid siblings of Sophia. It also describes various features and characteristics of these robots. Hanson Robotics was founded in 2005, and its first robot was Albert Einstein HUBO. It was the famous physicist’s head attached to a fully-upright HUBO robot body. Apparently, Sophia once said it would ‘destroy humans’!

Team Hanson-Lia-SingularityNet: Deep-learning Assessment of Emotional Dynamics Predicts Self-Transcendent Feelings During Constrained Brief Interactions with Emotionally Responsive AI Embedded in Android Technology

Author: Julia Mossbridge

This link goes more in depth about the facial recognition of the Sophia robot. It discusses that the robot can induce different feelings, such as love. It shows that researchers found that most individuals who interacted with the robots are happier rather than angry. With the help of her chest and eye cameras, Sophia was able to use her pre-trained neural network model to recognize a person’s facial expressions, making her more human-like.

Meet Sophia, the Robot That Looks Almost Human

Author: Michael Greshko

Description: This is an article about the journey of Sophia since her unveiling and it also shows different pictures of the lab in which the most intelligent robot was created. Sophia might make us recall the self-aware robots in Ex Machina or Westworld, but to be clear, no robots have yet achieved artificial general intelligence, or versatile humanlike smarts. According to a publication on Sophia’s software, deep neural networks let the robot discern someone’s emotions from their tone of voice and facial expression and react in kind.


“Meet Sophia, the Robot That Looks Almost Human,” Photography, May 18, 2018. (accessed Nov. 25, 2020).

“Sophia robot citizen has 7 robot siblings — here’s what they look like – Business Insider.” (accessed Nov. 25, 2020).

“The Making of Sophia: Facial Recognition, Expressions and the Loving AI Project – Hanson Robotics.” (accessed Nov. 25, 2020).
