INDEX
Explanations
numerical values within text
punctuation and structural markers in text
Negative Logits
æ©       -0.97
hart     -0.94
Sag      -0.90
TAG      -0.85
455      -0.84
hern     -0.79
»Ĵ       -0.78
¥µ       -0.74
Stack    -0.73
Nost     -0.73
Positive Logits
lee      0.92
Rae      0.87
DA       0.84
Rai      0.84
ael      0.80
978      0.79
da       0.77
aji      0.77
server   0.76
Server   0.75
Activations Density 0.364%