INDEX
Explanations
the word "department"
references to various departments in a context related to institutions or organizations
Negative Logits
telling       -0.78
Intent        -0.73
isers         -0.67
Mini          -0.65
iser          -0.64
lihood        -0.60
pak           -0.59
Instruments   -0.59
chers         -0.59
isks          -0.59
Positive Logits
artment       0.97
al            0.92
alities       0.90
artments      0.89
utical        0.86
alse          0.85
ij士          0.83
naire         0.82
ality         0.82
istry         0.81
Activations Density 0.020%