INDEX
Explanations
definitions or mentions of dictionaries
references to dictionaries and encyclopedias
Negative Logits
roach       -0.75
eem         -0.72
sent        -0.71
aints       -0.69
raising     -0.68
ded         -0.68
rd          -0.68
rity        -0.68
achine      -0.68
Wil         -0.66
Positive Logits
dictionary   1.23
Dictionary   1.16
ーティ       0.98
ictionary    0.90
translator   0.86
ファ         0.83
ック         0.83
tymology     0.80
abulary      0.78
アル         0.77
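Byte-level BPE vocabularies display non-ASCII tokens as strings such as `ãĥ¼ãĥĨãĤ£`, which is how the katakana entries in the positive-logit list appear in raw token dumps. The following is a minimal sketch of how such strings can be mapped back to readable text. It assumes the model behind this feature uses the GPT-2 style byte-to-unicode table, which the page itself does not state; `decode_token` is a hypothetical helper name.

```python
def bytes_to_unicode():
    """GPT-2 style table: each of the 256 byte values gets a printable character."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # The remaining bytes are shifted to code points starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a displayed token string back into UTF-8 text."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # A token may end in the middle of a multi-byte character, hence errors="replace".
    return raw.decode("utf-8", errors="replace")


for tok in ["ãĥ¼ãĥĨãĤ£", "ãĥķãĤ¡", "ãĥĥãĤ¯", "ãĤ¢ãĥ«"]:
    print(tok, "->", decode_token(tok))
```

Under that assumption the four strings decode to ーティ, ファ, ック, and アル, all katakana fragments.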
Activations Density 0.007%
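The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below assumes that reading (activation strictly above a threshold of zero), which the page does not define; the function name `activation_density` and the sample data are illustrative, not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 7 active positions out of 100,000 sampled tokens.
sample = [0.0] * 99_993 + [1.5] * 7
print(f"{activation_density(sample):.3%}")  # prints 0.007%
```

A density this low would mean the feature is silent on almost every token, which fits a feature tied to a narrow topic such as dictionaries.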