INDEX
Explanations
non-English characters and potentially names or locations
special characters or non-standard symbols
Negative Logits
vier         -0.71
Heath        -0.70
Starr        -0.69
ORED         -0.67
liness       -0.67
SPONSORED    -0.66
20439        -0.66
Daly         -0.65
bury         -0.65
Hastings     -0.65
Positive Logits
Ŀ            1.38
Ð            1.35
¹            1.33
±            1.32
Ķ            1.29
ł            1.27
³            1.27
ļ            1.21
¡            1.21
Ĵ            1.19
Activations Density: 0.002%