
Posts

Showing posts from January, 2024

Global and Local Models

Recently we have seen a number of chatbots (ChatGPT, Bard, etc.). These come under LLMs (large language models), which are trained on huge datasets. I will discuss the methods these models follow to give personalized and accurate responses, broken into three steps:

- Information gathering: gathers the user's search history, topics discussed, queries, etc.
- Information clustering: clusters related concepts, facts, and ideas.
- Personalized responses: searches the information clusters to find the most relevant and helpful information.

The information clustering process is a knowledge acquisition and adaptation process, which shares some similarity with online learning but has some key differences. Knowledge acquisition (KA) vs. subset selection (SS): KA is the broader concept; it encompasses various methods and processes of gathering and accumulating new knowledge or information, including learning from experiences, interactions, readings, and other sources. SS is a narrower concept. It re...
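The clustering and retrieval steps above can be sketched in a few lines. This is a hypothetical illustration only: real systems use learned embeddings and far more sophisticated retrieval, but a simple keyword-overlap (Jaccard similarity) clusterer shows the idea. All function names and the similarity threshold are made up for this sketch.

```python
# Sketch of "information clustering" and "personalized responses":
# group past user queries by keyword overlap, then answer a new query
# from the most similar cluster. Illustrative only, not a real pipeline.

def tokens(text):
    """Crude tokenizer: lowercase, split on whitespace."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b)

def cluster_queries(queries, threshold=0.2):
    """Greedy single-pass clustering: a query joins the first cluster
    whose representative (first member) it overlaps with enough,
    otherwise it starts a new cluster."""
    clusters = []
    for q in queries:
        for c in clusters:
            if jaccard(tokens(q), tokens(c[0])) >= threshold:
                c.append(q)
                break
        else:
            clusters.append([q])
    return clusters

def best_cluster(clusters, new_query):
    """Retrieval step: pick the cluster whose representative best
    matches the new query."""
    return max(clusters, key=lambda c: jaccard(tokens(c[0]), tokens(new_query)))

history = [
    "how to train a neural network",
    "neural network learning rate tips",
    "best pasta recipes",
    "easy pasta dinner recipes",
]
clusters = cluster_queries(history)
print(len(clusters))   # the two topics separate into two clusters
print(best_cluster(clusters, "pasta recipes for dinner")[0])
```

A production system would replace Jaccard overlap with embedding similarity, but the gather-cluster-retrieve shape stays the same.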

Sequence Networks

Hi, in this blog I will cover the main sequence networks. A sequence network can take a sequence as input, produce a sequence as output, or both. We can subdivide sequence networks into the following three types: Vec2Seq, Seq2Vec, Seq2Seq.

1. Vec2Seq (Sequence Generation): p(y_{1:T} | x) = Σ_{h_{1:T}} Π_{t=1}^{T} p(y_t | h_t) p(h_t | h_{t-1}, y_{t-1}, x), where h_t is the hidden state and p(h_1 | h_0, y_0, x) = p(h_1 | x) is the initial hidden state distribution. For categorical and real-valued output, the distributions are given by p(y_t | h_t) = Cat(y_t | softmax(W_hy h_t + b_y)) and p(y_t | h_t) = N(y_t | W_hy h_t + b_y, σ²I), respectively. The above generative model is called a Recurrent Neural Network (RNN).

2. Seq2Vec (Sequence Classification): in a classification task the output is a class label, p(y | x_{1:T}) = Cat(y | softmax(W h_T)). We get better results if we let the hidden states depend on the past as well as the future context (a bidirectional RNN). Then we define the hidden state at time t as h_t = [h_t^→, h_t^←], where h_t^→ = φ(W_xh^→ x_t + W_hh^→ h_{t-1}^→ + b^→) and h_t^← = φ(W_xh^← x_t + W_hh^← h_{t+1}^← + b^←).

3. Seq2Seq (Sequence Translation): Aligned case where the initial state, ...
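The shared recurrence behind Vec2Seq and Seq2Vec can be made concrete with a tiny scalar RNN in plain Python. This is a minimal sketch with made-up fixed weights (a real model would learn W_hh, W_xh, and the biases, and use vectors rather than scalars); it only shows how the two types reuse the same step h_t = φ(W_xh x_t + W_hh h_{t-1} + b).

```python
import math

def rnn_step(h_prev, x, w_h=0.5, w_x=0.3, b=0.1):
    """One recurrence: h_t = tanh(w_h * h_{t-1} + w_x * x_t + b).
    Scalar stand-in for the matrix form in the text."""
    return math.tanh(w_h * h_prev + w_x * x + b)

def seq2vec(xs):
    """Sequence classification style: run the recurrence over the whole
    input sequence and keep only the final hidden state as the summary."""
    h = 0.0  # initial hidden state, fixed at zero for this sketch
    for x in xs:
        h = rnn_step(h, x)
    return h

def vec2seq(x, steps=3):
    """Sequence generation style: condition every step on the same input
    x and emit one output (here, the hidden state itself) per step."""
    h, outputs = 0.0, []
    for _ in range(steps):
        h = rnn_step(h, x)
        outputs.append(round(h, 4))
    return outputs

print(seq2vec([1.0, -0.5, 2.0]))  # one number summarizing the sequence
print(vec2seq(1.0))               # a short sequence generated from one input
```

A bidirectional Seq2Vec would simply run a second copy of the loop over the reversed input and concatenate the two final states, exactly as in the h_t = [h_t^→, h_t^←] definition above.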