Explanations
deportation, asylum, detention
Negative Logits
pastel      0.47
Carl        0.45
Ta          0.45
작업        0.45
trabaj      0.44
Think       0.44
Sew         0.44
Christina   0.44
tầng        0.44
Yeast       0.44
Positive Logits
deportation    0.86
deport         0.73
extradition    0.64
deported       0.63
Deport         0.57
detainees      0.57
asylum         0.53
repatriation   0.53
detention      0.52
Afghanistan    0.52
Activations Density: 0.022%