INDEX
Explanations
phrases related to physical actions or events
instances of punctuation and certain significant phrases
Negative Logits
usterity    -0.76
orer        -0.72
zbek        -0.66
OND         -0.66
aran        -0.66
obin        -0.64
NK          -0.63
TN          -0.61
nai         -0.61
ctor        -0.60
Positive Logits
}"            0.67
FontSize      0.66
respectively  0.60
76561         0.59
Minotaur      0.58
decap         0.58
peg           0.57
diminishing   0.57
viz           0.57
rained        0.56
Activations Density 0.287%