INDEX
Explanations
phrases related to transmission and spread
Negative Logits
ipolar    -0.77
htaking   -0.70
rely      -0.68
ravel     -0.68
�         -0.67
xtap      -0.67
rodu      -0.66
eli       -0.65
athi      -0.64
ipel      -0.64
Positive Logits
unwilling   0.84
adamant     0.66
ancies      0.61
reluctant   0.60
adoptive    0.59
cerv        0.57
doors       0.57
OOD         0.57
braces      0.57
ESSION      0.57
Activations Density 0.093%