INDEX
Explanations
personal pronouns and verbs related to communication
expressions of personal opinions or reflections on experiences
NEGATIVE LOGITS
orem        -0.64
2022        -0.62
arithmetic  -0.60
urgently    -0.59
miracle     -0.58
atories     -0.57
Clause      -0.55
Eight       -0.54
2024        -0.54
inscribed   -0.54
POSITIVE LOGITS
kinda    1.07
alot     1.00
haha     0.96
maybe    0.82
doesnt   0.82
whats    0.81
laughs   0.81
Anyway   0.80
didnt    0.79
anyways  0.76
Activations Density: 1.295%
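
The activation density figure can be read as the share of token positions on which the feature fires. The sketch below is a minimal illustration under that assumption; the page itself does not state how the number is computed, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all positions).
    The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 13 of which are active.
sample = [0.0] * 987 + [0.5] * 13
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 1.295% would mean the feature is active on roughly 13 of every 1000 tokens sampled.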