INDEX
Explanations
the characters 'Ļ' or 'Ċ'
negations or denials
Negative Logits
mathemat     -0.78
successes    -0.69
Seym         -0.64
Imag         -0.64
Skydragon    -0.64
Gutenberg    -0.64
Nanto        -0.64
Lawn         -0.63
Scenes       -0.63
Companion    -0.63
Positive Logits
ï¸ı          0.95
ï¸           0.91
sure         0.88
hip          0.87
agree        0.85
âĶĢ          0.83
yet          0.82
ðŁĺ          0.81
ski          0.80
le           0.80
Activations Density: 0.213%