Explanations
first-person singular pronouns followed by verbs expressing thoughts or emotions
Negative Logits
meanwhile      -0.69
Lank           -0.64
Lys            -0.60
Hanson         -0.59
Locations      -0.58
Lanka          -0.58
Tracks         -0.58
Kamp           -0.56
eret           -0.56
;;;;;;;;       -0.56
Positive Logits
Reviewer       0.95
suddenly       0.88
inexpl         0.84
apsed          0.81
unexpectedly   0.80
uddenly        0.80
mysteriously   0.78
Suddenly       0.75
iphany         0.72
amorph         0.72
Activations Density: 0.773%