INDEX
Explanations
specific entities such as companies, organizations, teams, and divisions
terms related to specific entities or categories
Negative Logits
hern        -0.63
wn          -0.59
Begin       -0.58
kidding     -0.57
Bere        -0.57
Hil         -0.56
sbm         -0.52
patience    -0.52
appre       -0.52
Gunn        -0.52
Positive Logits
or              0.93
denomination    0.86
imaginable      0.84
combinations    0.83
group           0.80
group           0.80
affiliation     0.79
depending       0.79
category        0.77
locale          0.77
Activations Density: 0.304%