Hello!

I’m Hasith, a senior at The University of Texas at Austin studying Physics and Mathematics. Some things that interest me are AI safety, Mechanistic Interpretability, Scientific Computing, and Reinforcement Learning. Feel free to reach out to me at hasith@utexas.edu for any discussions, questions, or collaborations–I’d love to listen to your ideas!

Induction Heads in Chronos Part 2

Previously, in the part 1 post, we found some evidence that induction heads exist in the Chronos models [1]. However, there were some things I did incorrectly and some things I wanted to explore further. First, my implementation of the repeated random tokens (RRT) method was incorrect: I randomly sampled over all the non-special tokens, but Chronos scales the given input such that the encoder input tokens almost always fall within token ids 1910-2187. Sampling over only this range greatly improved the attention mosaics. Second, I wanted to study how changing the number of repetitions and the lengths of the individual sequences in the RRT affects how many induction heads we detect. Third, I wanted to go beyond RRT data and see if we can find any interesting inductive properties in multisine data. Background First, let me clear up what an induction head actually is in a more concrete way than my last post. ...

May 28, 2025 · 6 min
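
As a rough illustration of the corrected sampling described in the excerpt above, here is a minimal sketch that builds a repeated-random-token sequence restricted to the quoted id range. The function name `make_rrt`, its parameters, and the choice to treat 1910-2187 as an inclusive range are assumptions for illustration; this is not the post's actual implementation and does not touch the Chronos tokenizer.

```python
import random

# Token-id range quoted in the excerpt; treating both ends as inclusive is an assumption.
TOKEN_LOW, TOKEN_HIGH = 1910, 2187


def make_rrt(seq_len, n_reps, seed=None):
    """Return one random block of `seq_len` token ids repeated `n_reps` times.

    Hypothetical helper: samples uniformly from [TOKEN_LOW, TOKEN_HIGH].
    """
    rng = random.Random(seed)
    block = [rng.randint(TOKEN_LOW, TOKEN_HIGH) for _ in range(seq_len)]
    # List repetition concatenates the same block n_reps times.
    return block * n_reps


tokens = make_rrt(seq_len=8, n_reps=3, seed=0)
print(len(tokens), tokens[:8] == tokens[8:16])
```

Varying `seq_len` and `n_reps` is one simple way to set up the kind of sweep over sequence length and repetition count mentioned above.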

Hunting for Induction Heads in Amazon's Chronos

Notice: While the theory here is correct, I realized I had some implementation errors, which are corrected in a follow-up post. This summer, I expect to be working on things related to mechanistic interpretability in time series forecasting, and a model of interest was Amazon’s Chronos model, a probabilistic time series forecasting model. To better understand how the model works and to get my hands dirty with some MI work, I decided to try and look for evidence of induction heads in Chronos. ...

May 11, 2025 · 7 min

A Clock Hand Puzzle

I used to not like analog clocks because they unnecessarily made it harder to tell time in a world where digital clocks are a reality. Now, I appreciate them a lot more for all the mathematical fun that they present. So, here’s a very simple puzzle I thought of while looking at one. The Puzzle It is 3:00 right now on an analog clock. How much longer do I have to wait to see the minute and the hour hands cross each other? ...

December 31, 2024 · 1 min
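
For readers who want to check their answer to the clock puzzle above, here is a small sketch of one way to set up the arithmetic (spoiler ahead). It assumes ideal, continuously moving hands: the minute hand sweeps 6 degrees per minute and the hour hand 0.5 degrees per minute. The function name `minutes_until_overlap` is made up for this example, and the full post may take a different route.

```python
from fractions import Fraction


def minutes_until_overlap(hour):
    """Minutes after `hour`:00 until the minute hand catches the hour hand."""
    # At hour:00 the hour hand leads the minute hand by 30 degrees per hour mark.
    gap_degrees = Fraction(30 * (hour % 12))
    # Minute hand: 6 deg/min. Hour hand: 1/2 deg/min. The gap closes at the difference.
    closing_rate = Fraction(6) - Fraction(1, 2)
    return gap_degrees / closing_rate


wait = minutes_until_overlap(3)
print(wait)  # 180/11 minutes, a little over 16 minutes
```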

Metrobike Optimization Around UT Austin

This project was done as our final project for William Gilpin’s Graduate Computational Physics Course. Our complete GitHub repository, with instructions on how to replicate our results, can be found here. Introduction The goal of this project is to simulate the behavior of a bike-sharing system in a network of stations and destinations, and then optimize the positions of the stations. We approach the simulation of the bike-sharing system with Agent Based Modeling (ABM). ...

December 10, 2024 · 12 min · Hasith Vattikuti, Eric Liang, Dev Desai, Viren Govin

Review of "Planting Undetectable Backdoors in Machine Learning Models" paper by Goldwasser

Notes on the paper Planting Undetectable Backdoors in Machine Learning Models by Shafi Goldwasser, Michael P. Kim, Vinod Vaikuntanathan, and Or Zamir. This paper was recommended to me by Scott Aaronson if I wanted to better understand some earlier, more cryptographic/theoretical work in backdooring neural networks. I am also reading through Anthropic’s Sleeper Agents paper, which is more recent and practical in its approach to backdooring current LLMs; those notes will be posted soon as well. ...

November 4, 2024 · 9 min · Hasith Vattikuti

Is Basketball a Random Walk?

About two years ago, I attended a seminar given by Dr. Sid Redner of the Santa Fe Institute titled, “Is Basketball Scoring a Random Walk?” I was certainly skeptical that such an exciting game shared similarities with coin flipping, but, nevertheless, Dr. Redner went on to convince me–and surely many other audience members–that basketball does indeed exhibit behavior akin to a random walk. At the very end of his lecture, Dr. Redner said something along the lines of, “the obvious betting applications are left as an exercise to the audience.” So, as enthusiastic audience members, let’s try to tackle this exercise. ...

August 17, 2024 · 8 min · Hasith Vattikuti

6.2 - The Invariance Principle

Let $\{\xi_n\}_{n \in \mathbb{N}}$ be a sequence of i.i.d. random variables such that $\mathbb{E}[\xi_n] = 0$ and $\mathbb{E}[\xi_n^2] = 1$. Then, define $$S_0 = 0, \quad S_N = \sum_{i=1}^N \xi_i$$ and by the Central Limit Theorem, rescaling $S_N$ by $\sqrt{N}$, we get that $$\frac{S_N}{\sqrt{N}} \xrightarrow{d} \mathcal{N}(0,1)$$ (the $\xrightarrow{d}$ means convergence in distribution) as $N \rightarrow \infty$. Using this, we can define a continuous random function $W^N_t$ on $t \in [0,1]$ such that $W_0^N = 0$ and ...

August 12, 2024 · 1 min · Hasith Vattikuti
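
The Central Limit Theorem scaling in the excerpt above can be checked numerically. The sketch below assumes the simplest admissible choice of steps, $\xi_i = \pm 1$ with probability $1/2$ (mean 0, variance 1), and compares the sample mean and variance of $S_N/\sqrt{N}$ with the limiting values 0 and 1. The function names and the values of `n_steps`, `n_trials`, and `seed` are arbitrary choices for illustration.

```python
import math
import random


def rescaled_sum(n_steps, rng):
    """Return S_N / sqrt(N) for N = n_steps i.i.d. +/-1 steps."""
    total = sum(rng.choice((-1, 1)) for _ in range(n_steps))
    return total / math.sqrt(n_steps)


def sample_moments(n_steps=200, n_trials=2000, seed=0):
    """Sample mean and variance of S_N / sqrt(N) over independent trials."""
    rng = random.Random(seed)
    samples = [rescaled_sum(n_steps, rng) for _ in range(n_trials)]
    mean = sum(samples) / n_trials
    var = sum((s - mean) ** 2 for s in samples) / n_trials
    return mean, var


mean, var = sample_moments()
print(f"mean ~ {mean:.3f}, variance ~ {var:.3f}")
```

Both numbers should land close to 0 and 1 respectively, which is consistent with (though of course not a proof of) convergence to $\mathcal{N}(0,1)$.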

A Simple Boarding Puzzle

The Puzzle Inspired by true events Alice is assigned to be the 56th passenger to board a full plane with 60 seats. However, a panic causes all the passengers–including Alice–to arrange themselves randomly in line to board. As Alice was originally 56th, she decides that she would be happy as long as passengers with the assigned spots 57, 58, 59, and 60 are not in front of her. What is the probability that Alice will be happy? ...

August 12, 2024 · 1 min · Hasith Vattikuti
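
For anyone who wants to test their answer to the boarding puzzle above, here is a small Monte Carlo sketch (spoiler ahead). It assumes the boarding line is a uniformly random permutation of the 60 passengers and compares the estimate with a symmetry argument: among Alice and passengers 57 to 60, each of the five is equally likely to be frontmost. The function names, trial count, and seed are illustrative choices, not taken from the post.

```python
import random
from fractions import Fraction

N_PASSENGERS = 60
ALICE = 56
LATER = (57, 58, 59, 60)  # passengers Alice wants behind her


def alice_is_happy(line):
    """True if every passenger in LATER stands behind Alice in `line`."""
    alice_pos = line.index(ALICE)
    return all(line.index(p) > alice_pos for p in LATER)


def estimate_happy_probability(n_trials=20000, seed=0):
    """Fraction of uniformly shuffled lines in which Alice is happy."""
    rng = random.Random(seed)
    line = list(range(1, N_PASSENGERS + 1))
    happy = 0
    for _ in range(n_trials):
        rng.shuffle(line)
        happy += alice_is_happy(line)
    return happy / n_trials


# Symmetry argument: Alice is the first of these five with probability 1/5.
exact = Fraction(1, len(LATER) + 1)
print(exact, estimate_happy_probability())
```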

6.1 - The Diffusion Limit of Random Walks

Random Walk Let $\{\xi_i\}$ be i.i.d. random variables such that $\xi_i = \pm 1$ with probability $1/2$. Then, define $$X_n = \sum_{k=1}^{n} \xi_k, \quad X_0 = 0.$$ $\{X_n\}$ is the familiar symmetric random walk on $\mathbb{Z}$. Let $W(m,N) = \mathbb{P}(X_N = m)$. It is easy to see that $$W(m,N) = {N \choose (N+m)/2} \left( \frac{1}{2} \right)^N$$ and that the mean and variance are $$\mathbb{E}[X_N] = 0, \quad \sigma^2_{X_N} = N$$ Diffusion Coefficient Definition 6.2: (Diffusion coefficient). The diffusion coefficient $D$ is defined as ...

August 10, 2024 · 5 min · Hasith Vattikuti
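
The binomial formula for $W(m,N)$ in the excerpt above can be sanity-checked with exact rational arithmetic. The sketch below evaluates the formula for a small $N$ and confirms that the probabilities sum to 1 with mean 0 and variance $N$. The function name `walk_pmf`, the choice $N = 10$, and the convention that $W(m,N) = 0$ when $N + m$ is odd or $|m| > N$ are assumptions made explicit here.

```python
from fractions import Fraction
from math import comb


def walk_pmf(m, n_steps):
    """P(X_N = m) for the symmetric +/-1 random walk after N = n_steps steps."""
    # Unreachable sites: too far away, or wrong parity.
    if abs(m) > n_steps or (n_steps + m) % 2 != 0:
        return Fraction(0)
    return Fraction(comb(n_steps, (n_steps + m) // 2), 2 ** n_steps)


N = 10
support = range(-N, N + 1)
total = sum(walk_pmf(m, N) for m in support)
mean = sum(m * walk_pmf(m, N) for m in support)
variance = sum(m * m * walk_pmf(m, N) for m in support) - mean ** 2
print(total, mean, variance)  # prints: 1 0 10
```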

5.4 - Gaussian Processes

Definition 5.9: A stochastic process $\{X_t\}_{t \geq 0}$ is a Gaussian Process if its finite dimensional distributions are consistent Gaussian measures for any $0 \leq t_1 < t_2 < \ldots < t_k$. Recall that a Gaussian random vector $\mathbf{X} = (X_1, X_2,\ldots,X_n)^T$ is completely characterized by its first and second moments $$\mathbf{m} = \mathbb{E}[\mathbf{X}], \quad \mathbf{K} = \mathbb{E}[(\mathbf{X} - \mathbf{m}) (\mathbf{X} - \mathbf{m})^T]$$ meaning that the characteristic function is expressed only in terms of $\mathbf{m}$ and $\mathbf{K}$ ...

August 6, 2024 · 3 min · Hasith Vattikuti