INDEX
Explanations
phrases that communicate simplification or generalization
Negative Logits
kerap              -0.59
Uran               -0.59
";                 -0.55
Marac              -0.55
particolarmente    -0.54
Repair             -0.53
repaired           -0.53
Discre             -0.53
Observed           -0.53
amigurumi          -0.53
Positive Logits
basically          1.75
Basically          1.74
basically          1.68
Basically          1.66
Essentially        1.51
essentially        1.48
essentially        1.47
Essentially        1.43
básicamente        1.19
基本上             0.86
Activations Density 0.135%