INDEX
Explanations
mentions of locations or specific places
references to educational experiences and interactions with classmates
Negative Logits
xit           -0.67
2020          -0.61
Hezbollah     -0.59
deterrence    -0.59
2024          -0.59
anwhile       -0.58
rivals        -0.57
worthiness    -0.55
threatens     -0.53
2025          -0.53
Positive Logits
myself        1.03
my            0.91
haha          0.81
:)            0.77
:-)           0.71
My            0.67
blogging      0.66
my            0.66
nesday        0.66
Soup          0.65
Activations Density 2.092%