INDEX
Explanations
mentions of a specific term 'Dot'
references to the game "DotA."
Negative Logits
Token        Logit
uberty       -0.82
eele         -0.76
ocene        -0.75
reditary     -0.69
WAYS         -0.67
éĹĺ          -0.65
iscopal      -0.65
aired        -0.65
ivals        -0.63
utherford    -0.63
Positive Logits
Token        Logit
Dot          1.02
dot          0.95
zeb          0.88
zh           0.78
lings        0.77
rice         0.74
sheet        0.70
forth        0.70
Designs      0.70
izens        0.70
Activations Density: 0.006%