INDEX
Explanations
instances of training and educational materials or programs
Negative Logits
gle         -0.50
seemingly   -0.48
cop         -0.46
pec         -0.46
pon         -0.44
posi        -0.44
atto        -0.43
somewhat    -0.43
Pell        -0.42
inde        -0.42
Positive Logits
TRAINING     1.03
TRAINING     1.02
Training     0.99
trainings    0.97
Training     0.96
training     0.96
training     0.90
treinamento  0.81
retraining   0.79
培训         0.74
Activations Density: 0.011%