INDEX
Explanations
names of people or places with specific formatting and symbols
the character or symbol "Ŀ" in the text
Negative Logits
referen          -0.80
awaru            -0.78
exhib            -0.68
unim             -0.68
memos            -0.67
wra              -0.67
chnology         -0.67
unborn           -0.66
ultras           -0.66
autobiography    -0.65
Positive Logits
ļ                0.89
Ŀ                0.87
º                0.86
bryce            0.85
ÏĦ               0.83
SourceFile       0.83
¼                0.81
ï¸ı              0.80
Ĺ                0.80
taboola          0.79
Activations Density 0.138%
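
Several of the top positive-logit tokens (ļ, Ŀ, Ĺ, ÏĦ, ï¸ı) look like the printable stand-ins that byte-level BPE vocabularies use for raw bytes, not like literal characters. The sketch below assumes the dashboard's model uses the GPT-2 style byte-to-character table; that is an assumption, since the page does not name the tokenizer. The helper names `token_to_bytes` and `describe` are hypothetical and exist only for this illustration.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2 style
    byte-level BPE (assumed here; the dashboard does not name its tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point at 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


BYTE_TO_CHAR = bytes_to_unicode()
CHAR_TO_BYTE = {c: b for b, c in BYTE_TO_CHAR.items()}


def token_to_bytes(token: str) -> bytes:
    """Map a displayed token string back to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def describe(token: str):
    """Return (raw_bytes, decoded_text); decoded_text is None when the bytes
    are only a fragment of a UTF-8 sequence."""
    raw = token_to_bytes(token)
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = None
    return raw, text


if __name__ == "__main__":
    for tok in ["ļ", "Ŀ", "º", "bryce", "ÏĦ", "¼", "ï¸ı", "Ĺ"]:
        print(repr(tok), describe(tok))
```

Under that assumption, "Ŀ" stands for the single byte 0x9D, which is a UTF-8 continuation byte and not a complete character, while "ÏĦ" decodes to the Greek letter τ. If so, the explanation 'the character or symbol "Ŀ"' may really describe a feature that predicts trailing bytes of multi-byte characters.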