Explanations
phrases or words with "afoul"
words related to instances of falling or going astray
Negative Logits
libel         -0.66
>[            -0.64
affinity      -0.59
ties          -0.58
enegger       -0.58
contribution  -0.57
attachment    -0.56
privilege     -0.55
Jav           -0.55
relocation    -0.55
Positive Logits
efully        0.97
rift          0.95
ither         0.90
leep          0.86
oud           0.83
eper          0.79
isively       0.78
aaa           0.74
ross          0.74
retty         0.73
Activations Density 0.093%