INDEX
Explanations
phrases related to specific people's names
mentions of specific individuals and large corporations
Negative Logits
Haram        -0.70
encl         -0.67
Disclaimer   -0.63
Thumbnail    -0.62
Tablet       -0.62
galleries    -0.61
ibli         -0.61
âĿ           -0.61
animous      -0.61
FontSize     -0.60
Positive Logits
hower        0.92
rend         0.85
clair        0.80
bach         0.80
chev         0.76
ORGE         0.74
perature     0.74
artz         0.74
baugh        0.72
agle         0.72
Activations Density 0.028%