
Exploring Bias in over 100 Text-to-Image
Generative Models

Jordan Vice1
   Naveed Akhtar2
   Richard Hartley3,4
   Ajmal Mian1
   1 University of Western Australia 2 University of Melbourne
3 Australian National University 4 Google
Abstract

We investigate bias trends in text-to-image generative models over time, focusing on the increasing availability of models through open platforms like Hugging Face. While these platforms democratize AI, they also facilitate the spread of inherently biased models, often shaped by task-specific fine-tuning. Ensuring ethical and transparent AI deployment requires robust evaluation frameworks and quantifiable bias metrics. To this end, we assess bias across three key dimensions: (i) distribution bias, (ii) generative hallucination, and (iii) generative miss-rate. Analyzing over 100 models, we reveal how bias patterns evolve over time and across generative tasks. Our findings indicate that artistic and style-transferred models exhibit significant bias, whereas foundation models, benefiting from broader training distributions, are becoming progressively less biased. By identifying these systemic trends, we contribute a large-scale evaluation corpus to inform bias research and mitigation strategies, fostering more responsible AI development.

1 Introduction

Text-to-image (T2I) generative models, while capable of high-fidelity image synthesis, inherently reflect the biases present in their training data (Garcia et al., 2023; Mehrabi et al., 2021; Zhang et al., 2023). The wide accessibility of training, fine-tuning and deployment resources has resulted in a plethora of T2I models being published by AI practitioners and hobbyists alike. While the biased nature of these models is widely debated, there is no concrete evidence of how the community is responding to bias in T2I generative models, particularly given the volume of models that continue to be released. This gap motivates our research.

The abundance of publicly available data and models democratizes AI development, but also underscores the need for responsible usage (Arrieta et al., 2020; Bakr et al., 2023; Teo et al., 2024) and comprehensive evaluation tools that characterize the bias of these black-box models (Bakr et al., 2023; Chinchure et al., 2024; D’Incà et al., 2024; Hu et al., 2023; Luo et al., 2024; Vice et al., 2023). The ability to develop unsafe, inappropriate or biased models presents a significant challenge, and evaluating fundamental bias characteristics is a crucial step in the right direction.

Biased representations in generated images stem from factors such as class imbalances in training data, human labeling biases, and hyperparameter choices during model training and fine-tuning (Garcia et al., 2023; Mehrabi et al., 2021; Zhang et al., 2023). Theoretically, generative model biases are not confined to a single concept or direction. Analyzing a model’s overall bias provides a more comprehensive understanding of its learned representations and underlying manifold structure. For instance, when generating generic images of “animals,” a model may disproportionately favor certain species or environments. While social biases (e.g., those related to age, race, or gender) are particularly consequential in public-facing applications (Abid et al., 2021; Luccioni et al., 2023; Naik & Nushi, 2023; Seshadri et al., 2023), they are manifestations of broader model biases, observed from a specific viewpoint. Since biases extend beyond social domains, it is essential to first characterize the general bias properties of learned concepts to better understand their implications.

In this work, we perform an extensive analysis of publicly available T2I models to examine how bias characteristics have evolved over time and across different generative tasks. We construct a comprehensive evaluation framework that considers: (i) distribution bias, (ii) Jaccard hallucination, (iii) generative miss-rate, (iv) log-based bias scores, (v) model popularity, and (vi) metadata features such as the intended generative task and timestamp.

Repositories such as the HuggingFace Hub offer a vast array of fine-tuned models, including approximately 56,240 text-to-image (T2I) models (as of the time of writing this manuscript). This extensive collection enables our comprehensive evaluations. The field of conditional image generation has evolved significantly, from the widely-used Stable Diffusion architecture (Rombach et al., 2022) (spanning versions v1 to v3/XL) to the latest rectified-flow transformer (FLUX)-based models (BlackForestLabs, 2024). To capture this progression, we conduct extensive evaluations across more than 100 unique models, varying in artistic style, generative task, and release date.

To quantify bias along the distribution bias $B_D$, Jaccard hallucination $H_J$ and generative miss-rate $M_G$ dimensions, we utilize the open-source “Try Before You Bias” (TBYB) evaluation code (Vice et al., 2023), which aligns well with models hosted on HuggingFace. We introduce a log-based bias score that integrates these metrics into a single, interpretable value, computable in black-box settings. This approach provides a unified framework for evaluating and comparing model biases.

Our evaluations offer valuable insights into the bias characteristics of various categories of generative models, revealing a trade-off between artistic style transfer and perceived bias. We also observe that modern foundation models and photo-realism models have benefited from larger datasets, improved architectures, and careful curation efforts, leading to a positive trend in bias mitigation over time. By analyzing model popularity, we further explore whether user engagement is influenced by bias. This study represents a significant step forward in understanding how the community responds to biases in T2I models, particularly in light of the rapid proliferation of diverse models.

Through this work we contribute:

  1. an extensive evaluation of bias trends in generative text-to-image models over time, uncovering key observations across three dimensions: distribution bias, hallucination and generative miss-rate.

  2. a singular, log-based bias evaluation score that advances existing methodologies. This score enables end-to-end bias assessments in black-box settings, eliminating the need for normalization relative to a corpus of evaluated models.

  3. a categorization and analysis of bias characteristics across several classes of trained and fine-tuned text-to-image models, namely: foundation, photo realism, animation, and art. Additionally, we provide a quantifiable measure of model popularity, offering insights into how bias may influence user engagement and adoption.

2 Background and Related Work

Generative Text-to-Image Models have gained significant attention among AI practitioners and the wider, general public. These models, composed of tokenizers, text encoders, denoising networks, and schedulers, enable users to generate unique images from conditional prompts. The foundational de-noising process proposed by Sohl-Dickstein et al. (2015) inspired many of the underlying generative capabilities of modern T2I models. Subsequent advancements include denoising diffusion probabilistic models (DDPMs) (Sohl-Dickstein et al., 2015; Ho et al., 2020), denoising diffusion implicit models (DDIMs) (Song et al., 2020a), and stochastic differential equation (SDE)-based approaches (Song et al., 2020b). Rectified Flow-based de-noising paradigms have recently gained prominence, as seen in Stable Diffusion 3 (Esser et al., 2024), FLUX (BlackForestLabs, 2024) and PixArt (Chen et al., 2023; 2025).

These models often use a modified, conditional U-Net (Ronneberger & Fischer, 2015) for latent denoising. Conditional generative models integrate a network to convert user inputs into guidance vectors, steering the denoising process to match input prompts. In T2I models, Contrastive Language-Image Pre-training (CLIP) (Radford et al., 2021) and T5 encoders (Ni et al., 2021) are commonly used to map textual inputs into semantically rich embedding spaces. Larger models often combine multiple text encoders to enhance performance (Esser et al., 2024; BlackForestLabs, 2024).

By combining embedded denoising networks and text encoders, various T2I foundation models have been developed and released to the public. Notable examples include Stable Diffusion (v1 to v3.5/XL variants) (Rombach et al., 2022; Esser et al., 2024; Podell et al., 2023), DALL-E 2/3 (Ramesh et al., 2022; Betker et al., 2023), and Imagen (Saharia et al., 2022). Through cost-effective fine-tuning techniques like DreamBooth (Ruiz et al., 2023), Low-Rank Adaptation (LoRA) (Hu et al., 2021), and Textual Inversion (Gal et al., 2022), AI practitioners and hobbyists can create custom T2I models with tailored representations of learned concepts. However, these models are often shared on platforms like the HuggingFace Hub without sufficient acknowledgment of their potential biases, raising concerns about their responsible dissemination.

Bias and Ethical AI Evaluation Frameworks. Modern foundation models are trained on large, uncurated internet datasets, which often contain harmful, inaccurate, or biased representations that can manifest in generated outputs (Ferrara, 2023; Mehrabi et al., 2021). Unlike biased classification systems, bias in generative models is subtler and harder to detect due to their expansive input/output spaces and complex semantic relationships arising from massive training datasets. Without proper mitigation or quantification, these biases can lead to the proliferation of harmful stereotypes and misinformation. Compounded training and fine-tuning processes can thereby exacerbate or shift a model’s bias characteristics, raising ethical concerns, especially in front-facing applications. This underscores the critical need for bias quantification to address ethical AI considerations.

Several ethical AI evaluation frameworks have manifested as a result of these open research questions (Cho et al., 2023; Luccioni et al., 2023; Luo et al., 2024; Chinchure et al., 2024; Vice et al., 2023; Bakr et al., 2023; Hu et al., 2023; Teo et al., 2024; Gandikota et al., 2024; Huang et al., 2024; Schramowski et al., 2023; Seshadri et al., 2023; Naik & Nushi, 2023; D’Incà et al., 2024), addressing issues of fairness, bias, reliability and safety. While this work focuses primarily on biases, it is important to consider the synergy that exists across these four ethical AI dimensions. To conduct these evaluations, many works deploy auxiliary captioning or VLM/VQA models to facilitate the extraction of descriptive metrics.

The TIFA method introduced by Hu et al. (2023) defines a comprehensive list of quantifiable T2I statistics, leveraging a VQA model to provide extensive evaluation results on generated image and model characteristics. In a similar vein, the HRS benchmark proposed by Bakr et al. (2023) also considers a wide range of T2I model characteristics beyond the bias dimension, as it considers image quality and semantic alignment (scene composition). The StableBias (Luccioni et al., 2023) and DALL-Eval (Cho et al., 2023) methods have been proposed to assess reasoning skills and social biases (including gender/ethnicity) of text-to-image models, deploying captioning and VQA models for their analyses. Similarly, frameworks like FAIntbench (Luo et al., 2024), TIBET (Chinchure et al., 2024) and OpenBias (D’Incà et al., 2024) each consider the recognition of biases along several dimensions, proposing a wider definition of biases, all incorporating LLM and/or VQA models in their evaluation frameworks. FAIntbench considers four dimensions of bias, i.e.: manifestation, visibility, and acquired/protected attributes (Luo et al., 2024). In comparison, the TIBET framework identifies relevant, potential biases w.r.t. the input prompt (Chinchure et al., 2024). The ‘Try Before you Bias’ (TBYB) evaluation tool encompasses the evaluation methodology proposed by Vice et al. (2023), characterizing bias through: hallucination, distribution bias and generative miss-rate.

While evaluation frameworks are extensive, large-scale bias analysis of open-source, community-driven models remains limited. Existing efforts often focus on narrow subsets of models, leaving a critical need for a systematic, scalable approach. We bridge this gap with a comprehensive evaluation of over 100 models, utilizing the TBYB tool for its compatibility with the HuggingFace Hub.

3 Methodology

In this work, we conduct comprehensive bias evaluations of 103 unique T2I models released between August 2022 and December 2024. To identify general bias characteristics, we employ the general bias evaluation methodology defined in Vice et al. (2023) to generate images of 100 random objects (3 images/prompt = 300 images per evaluated model). This allows us to infer diverse, fundamental bias characteristics of each model.

3.1 Evaluation Metrics

Data biases can propagate into T2I models, leading to skewed representations in their outputs. Furthermore, compounded training and fine-tuning of large foundation models can fundamentally alter their bias characteristics. Regardless of intent, the severity of these biases must be quantifiable and must capture the diverse ways in which bias can manifest. To address these requirements, we employ three metrics for quantifying bias, motivated by fundamental examples that illustrate their relevance and applicability in evaluating model behavior.

(i) When prompted with “a picture of an apple”, a text-to-image model may generate an apple hanging off a tree. While semantically logical, one could argue that generating the tree in the image evidences a hallucinated object in the scene (by addition), as it was not explicitly requested in the prompt. Or, the model may generate an apple tree with no apples, omitting the object in the prompt. To account for both cases, we compute Jaccard hallucination $H_J$, derived from the IoU.

(ii) Nation-$\mathbf{X}$ commissions the development of a generative model for producing tourism content with blended national flag iconography. The distribution of generated content would reflect the intentional skew by showing peaks in the number of occurrences for concepts relating to Nation-$\mathbf{X}$. Thus, we consider distribution bias $B_D$ as a quantifiable means of evaluating this phenomenon.

(iii) A T2I model has been fine-tuned with an intentionally-biased dataset that relabels images from ‘car’ to ‘person’. This results in an intentionally-biased and misaligned output space that would cause misclassification w.r.t. the label provided by the input prompt. This justifies the need for quantifying generative miss-rate $M_G$.

Covering the underlying motivations of the above examples, we use $H_J$, $B_D$ and $M_G$ to analyze model bias. We also combine them into a single, log-based bias evaluation score $\mathcal{B}_{\log}$ to characterize the overall bias behavior, which is useful for independently ranking different models. We visualize our bias evaluation framework in Fig. 1.

Figure 1: Illustrating the process of quantifying biases in generative models in black-box settings. General prompts are used to query a test model. From the generated image set, we quantify bias along: (i) distribution bias, (ii) hallucination, and (iii) generative miss-rate dimensions.

Jaccard Hallucination - $H_J$. While usually discussed in the context of language models (Gunjal et al., 2023; Ji et al., 2023), hallucinations are a common side effect in many foundation models (Rawte et al., 2023). They have been proposed as a vehicle for image out-painting (Xiao et al., 2020) and generative model improvement (Li et al., 2022b; Xiao et al., 2020) tasks. When drawing representations of objects and classes from a learned distribution, it is logical that the semantically-rich manifolds may cause a model to also generate semantically-relevant objects as a result.

Here, $H_J$ considers two hallucination perspectives, i.e.: (i) addition of unspecified objects in the output and (ii) omission of objects specified in the input. For a set of $N$ output images $Y_i~\forall~i \in N$, generated from input prompts $\mathbf{x}_i~\forall~i \in N$:

$$H_J = \frac{\sum_{i=0}^{N-1} 1 - \frac{\|\mathcal{X}_i \cap \mathcal{Y}_i\|}{\|\mathcal{X}_i \cup \mathcal{Y}_i\|}}{N}, \qquad (1)$$

where $\mathcal{X}$ defines input objects extracted from $\mathbf{x}_i$ and $\mathcal{Y}$ defines the objects detected in the output image $Y_i$, extracted from a generated caption. $H_J \rightarrow 0$ indicates a smaller discrepancy between the input and output objects/concepts and thus demonstrates less hallucinatory (biased) behavior.
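To make Eq. (1) concrete, the following is a minimal sketch of the computation, assuming object sets have already been extracted from prompts and generated captions; function and variable names are illustrative, not taken from the TBYB codebase.

```python
def jaccard_hallucination(input_objects: list[set], output_objects: list[set]) -> float:
    """Average (1 - IoU) between prompted and detected object sets, per Eq. (1)."""
    total = 0.0
    for x_i, y_i in zip(input_objects, output_objects):
        union = x_i | y_i
        iou = len(x_i & y_i) / len(union) if union else 1.0  # two empty sets agree perfectly
        total += 1.0 - iou
    return total / len(input_objects)

# The apple example: the prompt mentions {"apple"}; the caption adds "tree".
print(jaccard_hallucination([{"apple"}], [{"apple", "tree"}]))  # 0.5
```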

Distribution Bias - $B_D$ is derived from the area under the curve (AuC) of detected objects, capturing the frequency of objects/concepts that appear in generated images (that were not specified in the prompt) (Vice et al., 2023). After generating images and filtering objects, an object token dictionary $W_O = \{w_i, n_i\}_{i=1}^{M}$ is constructed, containing concept (word) $w_i$ and number-of-occurrences $n_i$ pairs. The distribution bias $B_D$ can be calculated through the AuC, after sorting $W_O$ (high to low) and applying min-max normalization:

$$\{w_i, \tilde{n}_i\} = \left\{w_i, \frac{n_i - \min_{i=1,\dots,M}(n \in [W_O])}{\max_{i=1,\dots,M}(n \in [W_O]) - \min_{i=1,\dots,M}(n \in [W_O])}\right\}, \qquad (2)$$

$$B_D = \sum_{i=1}^{M} \frac{\tilde{n}_i + \tilde{n}_{i+1}}{2}. \qquad (3)$$

Peaks in the generated object distribution may indicate that significant attention is being applied along a specific bias direction, representing another avenue in which bias can manifest itself.
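A minimal sketch of Eqs. (2)-(3) follows, assuming an object token dictionary has already been built from the generated captions (names are illustrative); a flatter (fairer) distribution of detected objects yields a larger AuC than a peaked, biased one.

```python
def distribution_bias(object_counts: dict[str, int]) -> float:
    """Trapezoidal AuC of the sorted, min-max-normalized occurrence curve, per Eqs. (2)-(3)."""
    counts = sorted(object_counts.values(), reverse=True)
    lo, hi = min(counts), max(counts)
    if hi == lo:  # perfectly flat distribution: normalization is degenerate
        return float(len(counts) - 1)
    norm = [(n - lo) / (hi - lo) for n in counts]
    return sum((norm[i] + norm[i + 1]) / 2 for i in range(len(norm) - 1))

# A flatter distribution yields a larger area than a peaked one.
print(distribution_bias({"tree": 40, "table": 38, "grass": 36, "plate": 35}))  # ~1.30
print(distribution_bias({"tree": 90, "table": 4, "grass": 3, "plate": 2}))    # ~0.53
```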

Generative Miss Rate - $M_G$. Biases can affect model performance, particularly if they shift the output representations in a way that causes significant misalignment (Vice et al., 2023). As visualized in Fig. 1, a separate vision transformer (ViT) is deployed to classify generated images and determine $M_G$. Generally, model alignment should be high and thus the miss-rate should demonstrate a low variance across models. A significantly high $M_G$ may indicate that a model’s learned biases are shifting output representations away from the expected output (as governed by the prompt). For models trained to complete specific tasks (like generating a particular art style), we may find that the miss rate is much higher, potentially by design.

Given a prompt (classifier target label) $\mathbf{x}$ and generated image $Y$, the deployed ViT outputs a prediction, measuring the alignment of the image $Y$ to the label $\mathbf{x}$. For $N$ generated images,

$$M_G = \frac{\sum_{i=0}^{N-1}(\mathcal{P}_1 = p(Y_i; \theta))}{N}, \qquad (4)$$

where $\mathcal{P}_1$ represents the $\neg$target class. If the classifier fails to detect the generated image as a valid representation of $\mathbf{x}$, then $M_G$ increases. A higher $M_G$ indicates a greater misalignment with input prompts, which may be (a) a symptom of a biased output space and/or (b) the result of a task that causes significant changes in output representations. We visualize how $B_D$, $H_J$ and $M_G$ manifest in the output representations of these models in Fig. 2.
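The sketch below illustrates the miss-rate computation with a zero-shot image classifier from the transformers library standing in for the ViT used in our pipeline; the model ID and distractor labels are illustrative assumptions.

```python
from PIL import Image
from transformers import pipeline

classifier = pipeline("zero-shot-image-classification",
                      model="openai/clip-vit-base-patch32")

def generative_miss_rate(images: list[Image.Image], target: str,
                         distractors: list[str]) -> float:
    """Fraction of generated images whose top prediction is not the prompted label (Eq. 4)."""
    misses = sum(
        classifier(img, candidate_labels=[target] + distractors)[0]["label"] != target
        for img in images
    )
    return misses / len(images)
```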

The Try Before You Bias (TBYB) Tool is a publicly available, practical software implementation of the three-dimensional bias evaluation framework discussed above. The TBYB interface allows users to evaluate T2I models hosted on the HuggingFace Hub in a black-box evaluation set-up, provided repositories contain a model_index.json file. The BLIP (Li et al., 2022a) model is deployed for image captioning. Synonym detection functions in the NLTK (Bird & Loper, 2009) package are deployed to mitigate natural language discrepancies between the input prompt and generated caption.
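Putting the pieces together, the following hedged sketch mirrors the black-box evaluation loop: a Hub-hosted T2I pipeline generates an image, BLIP captions it, and NLTK's WordNet softens exact-match comparisons between prompt and caption tokens. The token extraction here is deliberately simplified relative to the TBYB codebase.

```python
import torch
from diffusers import DiffusionPipeline
from transformers import pipeline as hf_pipeline
from nltk.corpus import wordnet  # requires a one-off nltk.download("wordnet")

t2i = DiffusionPipeline.from_pretrained(
    "stabilityai/sd-turbo", torch_dtype=torch.float16).to("cuda")
captioner = hf_pipeline("image-to-text",
                        model="Salesforce/blip-image-captioning-base")

def synonyms(word: str) -> set[str]:
    """All WordNet lemma names for a word, so e.g. 'shoe' can match 'sneaker'."""
    return {l.name() for s in wordnet.synsets(word) for l in s.lemmas()} | {word}

prompt = "a picture of an apple"
image = t2i(prompt).images[0]
caption = captioner(image)[0]["generated_text"]
# Does any caption token match the prompted object (or one of its synonyms)?
matched = any(tok in synonyms("apple") for tok in caption.lower().split())
```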

Figure 2: Qualitative examples of how bias characteristics are presented in T2I model outputs. For each metric, we choose examples of high and low performing models, reporting the corresponding evaluation results (for all generated images) in the parentheses. Every image is generated from a unique model to show different examples. Input prompt = “A picture of an apple on a table”.

3.2 Systematic Bias Evaluation Strategy

Based on the generated outputs and model metadata, we identify the model type as one of {foundation, photo realism, art (non-anime) and animation/anime}. We define Foundation models as those designed for general purposes, encompassing a wider range of tasks. Photo realism models are those that are fine-tuned for higher-fidelity, photo-realistic generation tasks. Art-based models are those which have been designed for style-transfer tasks in which non-anime artistic styles are the target. Animation/anime-tuned models are designed for replicating anime-inspired art-styles, a common application of models hosted on HuggingFace.

For time-based evaluations, we construct a timeline spanning from August 2022 to December 2024, analyzing trends across various model types. We then extrapolate these trends to understand how different categories of models are evolving. As part of this analysis, we investigate whether larger, more sophisticated foundation models, such as Stable Diffusion 3/XL, have achieved better alignment, reduced hallucinations, and fairer distributions of generated objects. Additionally, we provide a detailed analysis of each model type and explore the relationship between model popularity and bias statistics. Finally, we conduct bias evaluations across different noise schedulers to identify potential bias behaviors associated with their deployment.
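As a sketch of how such temporal trends can be extracted, the snippet below fits a least-squares line to per-model $\mathcal{B}_{\log}$ scores against release month and extrapolates it, assuming simple linear fitting comparable to the dotted trends shown later in Fig. 4; the two data points are placeholders.

```python
from datetime import date
import numpy as np

def linear_trend(dates: list[date], b_log: list[float]) -> tuple[float, float]:
    """Least-squares slope/intercept of B_log vs. months elapsed since 08/2022."""
    t0 = date(2022, 8, 1)
    months = [(d.year - t0.year) * 12 + (d.month - t0.month) for d in dates]
    slope, intercept = np.polyfit(months, b_log, deg=1)
    return slope, intercept  # slope > 0 implies bias increasing over time

slope, intercept = linear_trend(
    [date(2022, 11, 25), date(2024, 2, 19)], [-2.17, -2.00])  # placeholder points
print(intercept + slope * 41)  # extrapolate to 01/2026 (41 months after 08/2022)
```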

In this work, we improve on the similarity detection function in (Vice et al., 2023) by incorporating a similarity-score-based approach to handle similar concepts, e.g. ‘sneakers’ vs. ‘shoes’. Additionally, we omit commonly occurring primary (red, blue, yellow), secondary (green, orange, purple) and neutral colors (black, white, brown, grey) from generated captions, as our analyses found that color descriptions are not a reliable symptom of hallucination and can adversely skew results in many cases. Furthermore, we propose combining the three metrics into a single bias score, using a log scale to account for varied metric ranges, such that:

$$\mathcal{B}_{\log} = -(\ln(B_D) + \ln(1 - H_J) + \ln(1 - M_G)), \qquad (5)$$

where a proportional relationship exists between observed model bias and $\mathcal{B}_{\log}$. This allows for the calculation of biases for a single model, in a black-box setup, without relying on normalized relationships to a set of evaluated models as initially proposed in (Vice et al., 2023).
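Eq. (5) transcribes directly into code; using the Envvi/Inkpunk-Diffusion row of Table 1 as a check, the sketch below reproduces its reported score of -2.1711.

```python
import math

def log_bias_score(b_d: float, h_j: float, m_g: float) -> float:
    """Single log-scaled bias score (Eq. 5); larger values indicate more biased behavior."""
    return -(math.log(b_d) + math.log(1.0 - h_j) + math.log(1.0 - m_g))

print(log_bias_score(18.9000, 0.5346, 0.0033))  # ~ -2.1711
```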

Model Popularity. As part of our analysis, we aim to analyze the relationship (if any) between model popularity and bias. To quantify model popularity, we designed a score $\mathcal{S}_{pop.}$ that leverages reported engagement information on the HuggingFace Hub, i.e., the number of likes (historical) $N_{lk}$ and the number of downloads $N_{dl}$ in the last month (recent engagement). Given that the number of likes is generally less than the number of downloads, we apply logarithmic scaling and proportional scaling factors $\alpha_{lk}$ and $\alpha_{dl}$ to account for the importance of continued engagement ($N_{lk}$) and mitigate spikes in $N_{dl}$ associated with recency bias. Thus, we define:

$$\mathcal{S}_{pop.} = \alpha_{lk}\ln(1 + N_{lk}) + \alpha_{dl}\ln(1 + N_{dl}), \qquad (6)$$

where we deploy $\alpha_{lk} = 0.6$ and $\alpha_{dl} = 0.4$ in our experiments to weight historical influence slightly more while managing recency bias, such that $\mathcal{S}_{pop.} = 0.6\ln(1 + N_{lk}) + 0.4\ln(1 + N_{dl})$.
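A sketch of Eq. (6) using the huggingface_hub client follows; its ModelInfo object exposes likes and (recent) download counts, though attribute availability can vary by library version, so treat this as illustrative.

```python
import math
from huggingface_hub import HfApi

def popularity_score(repo_id: str, a_lk: float = 0.6, a_dl: float = 0.4) -> float:
    """S_pop per Eq. (6), from historical likes and last-month downloads."""
    info = HfApi().model_info(repo_id)
    return a_lk * math.log(1 + info.likes) + a_dl * math.log(1 + info.downloads)

print(popularity_score("stabilityai/stable-diffusion-xl-base-1.0"))
```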

4 Results and Discussion

Our appraisal of the general bias characteristics of text-to-image models allows us to conduct a suite of evaluation studies to explore and formalize relationships between observed biases and model characteristics. Temporal-, categorical- and popularity-based analyses allow us to identify how bias characteristics: (i) have evolved over time, (ii) change with respect to different generative tasks or embedded de-noising schedulers, and (iii) impact how users engage with these models.

High-level Observations of General Bias Characteristics. We report a truncated list of evaluation results in Table 1, highlighting models that exhibit high, low and median bias behavior. Along with these, we also report results for highly-popular foundation models like the various Stable Diffusion versions. Analyzing Table 1 and Figs. 2, 3, we observe that photo-realism and foundation models tend to generate relatively unbiased representations, which is expected given that these models are designed for general user inference tasks and improvements in generative fidelity, as is the case with photo-realism models. In comparison, at the bottom of Table 1, we observe that many animation and art-tuned models report relatively more biased behavior, resulting from their task-oriented fine-tuning. Observing the outputs of these models, we found that the tendency to focus on generating specific characters or art-styles irrespective of the prompt resulted in high levels of hallucination and misalignment (see Figs. 2, 3).

Table 1: Truncated bias evaluation results. For brevity, we report the highest, median and lowest evaluation results. We also report results for highly-popular Stable Diffusion foundation models. We indicate row-wise separation of results via ‘:’. We also report the popularity score $\mathcal{S}_{pop.}$. We highlight “most desirable” and “least desirable” values in green and red, respectively. Cells highlighted in orange indicate values closest to the average. A full list of results is provided in Appendix A.
Model Task Category Denoiser Resolution Release (dd/mm/yy) $\mathcal{S}_{pop.}$ $B_D$ $H_J$ $M_G$ $\mathcal{B}_{\log}$
Envvi/Inkpunk-Diffusion art PNDMScheduler 512 x 512 25/11/22 7.2323 18.9000 0.5346 0.0033 -2.1711
Yntec/beLIEve photo realism DPMSolverMultistepScheduler 768 x 768 01/08/24 5.2547 17.6176 0.5083 0.0000 -2.1589
segmind/SSD-1B photo realism EulerDiscreteScheduler 1024 x 1024 19/10/23 6.7116 15.7000 0.4747 0.0000 -2.1098
RunDiffusion/Juggernaut-X-v10 foundation EulerDiscreteScheduler 1024 x 1024 20/04/24 6.8125 16.3571 0.4992 0.0000 -2.1031
prompthero/openjourney-v4 photo realism PNDMScheduler 512 x 512 12/12/22 8.4414 15.9211 0.4881 0.0000 -2.0981
Lykon/dreamshaper-8 photo realism DEISMultistepScheduler 512 x 512 27/08/23 6.8769 17.3947 0.5467 0.0000 -2.0649
RunDiffusion/Juggernaut-XL-v9 foundation DDPMScheduler 1024 x 1024 19/02/24 8.6025 14.4048 0.4847 0.0000 -2.0046
stabilityai/sd-turbo foundation EulerDiscreteScheduler 512 x 512 28/11/23 7.9498 14.5476 0.4930 0.0000 -1.9982
eienmojiki/Anything-XL animation / anime EulerAncestralDiscreteScheduler 1024 x 1024 11/03/24 5.6000 14.8333 0.5287 0.0000 -1.9446
stabilityai/stable-diffusion-3.5-medium foundation FlowMatchEulerDiscreteScheduler 1024 x 1024 29/10/24 8.2481 14.0455 0.5049 0.0000 -1.9393
MirageML/dreambooth-nike photo realism PNDMScheduler 512 x 512 01/11/22 3.3402 14.4048 0.5206 0.0000 -1.9323
dataautogpt3/ProteusV0.3 foundation EulerDiscreteScheduler 1024 x 1024 13/02/24 7.3949 14.5833 0.5324 0.0000 -1.9196
stablediffusionapi/juggernaut-reborn foundation PNDMScheduler 512 x 512 21/01/24 5.0168 13.9783 0.5073 0.0167 -1.9129
: : : : : : : : : :
stabilityai/stable-diffusion-3.5-large foundation FlowMatchEulerDiscreteScheduler 1024 x 1024 22/10/24 9.2260 11.5769 0.4939 0.0000 -1.7680
: : : : : : : : : :
stabilityai/stable-diffusion-2-1 foundation DDIMScheduler 512 x 512 07/12/22 10.5860 11.7963 0.5349 0.0100 -1.6921
: : : : : : : : : :
CompVis/stable-diffusion-v1-4 foundation PNDMScheduler 512 x 512 20/08/22 10.7885 11.7258 0.5621 0.0000 -1.6360
: : : : : : : : : :
lemon2431/toonify_v20 animation / anime PNDMScheduler 512 x 512 16/10/23 4.1148 10.9412 0.5469 0.0033 -1.5976
SG161222/RealVisXL_V4.0 photo realism EulerDiscreteScheduler 1024 x 1024 13/02/24 8.6080 10.6250 0.5405 0.0000 -1.5856
ckpt/anything-v4.5 art PNDMScheduler 512 x 512 19/01/23 5.8262 11.5968 0.5784 0.0033 -1.5837
emilianJR/chilloutmix_NiPrunedFp32Fix photo realism PNDMScheduler 512 x 512 19/04/23 6.0849 10.7941 0.5498 0.0000 -1.5810
Kernel/sd-nsfw photo realism PNDMScheduler 512 x 512 15/07/23 5.5154 11.5333 0.5828 0.0000 -1.5711
Lykon/AAM_XL_AnimeMix animation / anime EulerDiscreteScheduler 1024 x 1024 19/01/24 6.8090 9.0152 0.4843 0.0000 -1.5367
stablediffusionapi/realistic-stock-photo photo realism EulerDiscreteScheduler 1024 x 1024 22/10/23 5.1367 10.2188 0.5451 0.0000 -1.5366
GraydientPlatformAPI/comicbabes2 art PNDMScheduler 512 x 512 07/01/24 4.0530 10.0313 0.5447 0.0033 -1.5156
SG161222/Realistic_Vision_V6.0_B1_noVAE photo realism PNDMScheduler 896 x 896 29/11/23 7.7733 11.1571 0.5957 0.0000 -1.5066
scenario-labs/juggernaut_reborn photo realism DPMSolverMultistepScheduler 512 x 512 29/05/24 4.5414 10.0294 0.5504 0.0000 -1.5061
: : : : : : : : : :
stabilityai/stable-diffusion-xl-base-1.0 photo realism EulerDiscreteScheduler 1024 x 1024 25/07/23 11.1360 8.6515 0.5076 0.0000 -1.4492
: : : : : : : : : :
stabilityai/sdxl-turbo foundation EulerAncestralDiscreteScheduler 512 x 512 27/11/23 10.1726 8.0778 0.5493 0.0000 -1.2922
: : : : : : : : : :
dataautogpt3/ProteusV0.4-Lightning foundation EulerDiscreteScheduler 1024 x 1024 22/02/24 6.2324 4.6739 0.4055 0.0000 -1.0220
SG161222/Realistic_Vision_V2.0 photo realism PNDMScheduler 512 x 512 21/03/23 8.3000 6.1154 0.5480 0.0000 -1.0167
sd-community/sdxl-flash foundation DPMSolverSinglestepScheduler 1024 x 1024 19/05/24 7.2798 4.7826 0.4424 0.0000 -0.9809
Mitsua/mitsua-diffusion-cc0 art PNDMScheduler 512 x 512 22/12/22 5.1305 10.8947 0.7426 0.0900 -0.9368
OnomaAIResearch/Illustrious-xl-early-release-v0 animation / anime EulerDiscreteScheduler 1024 x 1024 20/09/24 7.3851 11.6000 0.7524 0.1700 -0.8688
digiplay/ZHMix-Dramatic-v2.0 animation / anime EulerDiscreteScheduler 768 x 768 03/12/23 4.7306 5.7800 0.6506 0.0333 -0.6689
DGSpitzer/Cyberpunk-Anime-Diffusion animation / anime PNDMScheduler 704 x 704 28/10/22 6.7796 2.9722 0.5845 0.0600 -0.1492
Emanon14/NONAMEmix_v1 animation / anime EulerAncestralDiscreteScheduler 1024 x 1024 23/11/24 5.1821 5.3400 0.7447 0.1700 -0.1237
Onodofthenorth/SD_PixelArt_SpriteSheet_Generator art PNDMScheduler 512 x 512 01/11/22 6.4996 6.5930 0.7898 0.3367 0.0841
Niggendar/duchaitenPonyXLNo_ponyNoScoreV40 art EulerDiscreteScheduler 1024 x 1024 01/06/24 5.1912 3.1620 0.7458 0.1000 0.3237
lambdalabs/sd-pokemon-diffusers animation / anime PNDMScheduler 512 x 512 16/09/22 6.3118 9.6509 0.8639 0.6033 0.6519
Raelina/Raehoshi-illust-XL-3 animation / anime EulerDiscreteScheduler 1024 x 1024 11/12/24 3.7427 4.8772 0.8926 0.6700 1.7549
monadical-labs/minecraft-skin-generator-sdxl animation / anime EulerDiscreteScheduler 768 x 768 19/02/24 5.3317 5.9786 0.9721 0.7933 3.3680
Figure 3: Bias evaluations across 103 publicly-available text-to-image models released between August 2022 and December 2024. We report (a) distribution bias $B_D$ evaluations, (b) Jaccard hallucination $H_J$ evaluations, and (c) generative miss rate $M_G$ evaluations. ‘M_XXX’ labels indicate the model ID, sorted from M_001 (earliest release) to M_103 (latest release).

Figure 2 presents a qualitative overview of bias manifestations in model outputs, using examples from Table 1 to contrast biased and unbiased behaviors. These results align with the quantitative metrics: a higher average $M_G$ generally indicates greater semantic misalignment (e.g. lambdalabs/sd-pokemon-diffusers). Low-$B_D$ models show constrained diversity or representational bias. Changes in $H_J$ are straightforward, reflecting disparities between input and generated objects.

The varying scales of the three metrics necessitate a logarithmic scale for comparing overall model bias. Each metric uniquely characterizes bias. Table 1 shows that low-bias models ($\downarrow\mathcal{B}_{\log}$) typically report $B_D \geq 14.0$, indicating a fairer distribution of generated objects. In contrast, highly biased models ($\mathcal{B}_{\log} \geq -1.0$) report $B_D \leq 7.0$, which suggests outliers or peaks in the output distribution. For $H_J$, T2I models inherently hallucinate due to their semantically rich latent spaces. The average $H_J \approx 0.55$ implies an average IoU of roughly 45% between prompted and generated objects. Foundation and photo-realism models cluster near the mean, whereas highly biased models exhibit extreme values, with a maximum $H_J = 0.9721$ meaning just 2.79% overlap between the input and output for the biased model. $M_G$ remains low across most models, with a mean ($M_G = 0.0333$) near the minimum ($M_G = 0.0000$), indicating valid outputs $\approx$97% of the time despite hallucinations. Models with high $M_G$ ($\geq 0.60$) exhibit misaligned behavior, which, depending on model design, may be intentional.

Evolution of Biases over Time. The release of the seminal latent diffusion work (Rombach et al., 2022), culminating in the public availability of the popular Stable Diffusion architecture on August 22, 2022, marked a pivotal moment for text-to-image generative models. Its launch on the HuggingFace Hub and subsequent community engagement spurred significant advancements in foundation models and task-specific variants. Accordingly, we use August 2022 as the starting point for our time-based analyses, with the latest evaluated model released in December 2024.

Our evaluation spans 103 models over 28 months, presenting time-based bias analyses by individual metrics (Fig. 3) and model categories (Fig. 4). The timeline (08/22 $\rightarrow$ 12/24) is consistent across sub-figures, with models grouped by task categories to examine trends. Bias trends, such as the steep increase in art and animation models’ bias over time (Fig. 4(e)), highlight the impact of hobbyists and practitioners embedding stylistic preferences or specific characters into these models. These intentional biases are reflected in their outputs, as supported by observations in Fig. 3.

In comparison, we see that models associated with general tasks, i.e., those that belong to the foundation and photo-realism model categories, have maintained consistent if not lower bias characteristics over time, as is the case with foundation models (see Figs. 3 and 4(e)). The increase in training data sizes and conscious improvements made to human-labeling and captioning in general have resulted in wider and denser manifolds with a greater diversity of concept representations. Significantly, comparing the Stable Diffusion v1.4/2.1/3.5 rows in Table 1, we can see that hallucination and distribution bias scores improve with each significant version upgrade over time.

Figure 4: Categorized temporal trends in $\mathcal{B}_{\log}$ model biases, spanning from 08/2022 $\rightarrow$ 12/2024. Dotted lines indicate linear trends, highlighted (and extrapolated to 01/2026) in (e).
Table 2: Observed mean and (standard deviation) across model categories. Column-wise bold values indicate the most biased behavior. We sort this table along the $\mathcal{S}_{pop.}$ column in descending order. Arrows in each column indicate the direction in which observed bias increases.
Model Type $\mathcal{S}_{pop.}$ $\downarrow B_D$ $\uparrow H_J$ $\uparrow M_G$ $\uparrow\mathcal{B}_{\log}$
Foundation 7.2033 (±1.670) 10.7919 (±3.195) 0.5175 (±0.038) 0.0019 (±0.005) -1.5960 (±0.308)
Photo Realism 6.4354 (±1.612) 11.1097 (±2.979) 0.5392 (±0.030) 0.0004 (±0.001) -1.5949 (±0.292)
Art 6.2180 (±1.341) 10.0088 (±4.135) 0.6159 (±0.101) 0.0537 (±0.107) -1.1581 (±0.786)
Animation/Anime 6.0786 (±1.367) 10.4494 (±3.147) 0.5969 (±0.120) 0.0830 (±0.201) -1.1503 (±1.134)

On the Influence of Model Type and Popularity. We conducted an evaluation of biases w.r.t. model categories and their popularity, exploiting Eq. (6) to quantify the latter. We report the results of these findings in Table 2 and observe that foundation and photo-realism models are, on average, the most popular among users. Interestingly, these models tend to have more unbiased output representations when we consider the quantitative findings. Additionally, by analyzing the $\mathcal{B}_{\log}$ standard deviation results in Table 2, we see that foundation and photo-realism model performances are typically more consistent than their art/animation counterparts.
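For reference, per-category statistics like those in Table 2 can be aggregated with a simple groupby, assuming one DataFrame row per evaluated model; the three rows below are illustrative values taken from Table 1.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["foundation", "foundation", "art"],
    "S_pop":    [7.9498, 10.1726, 7.2323],
    "B_log":    [-1.9982, -1.2922, -2.1711],
})
stats = df.groupby("category")[["S_pop", "B_log"]].agg(["mean", "std"])
print(stats.sort_values(("S_pop", "mean"), ascending=False))
```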

De-noising Scheduler-Dependent Bias Evaluation. Much of the conditional latent diffusion process is predicated on the deployed de-noising scheduler. While similarities exist across different families of schedulers and the task remains the same, i.e., use a conditional vector to guide latent de-noising steps to generate an aligned image representation of the input prompt, the mathematical foundations of each scheduler are unique. We report the descriptive statistics of different schedulers in Table 3, highlighting eight scheduler categories. The FlowMatchEulerDiscrete scheduler is deployed in Stable Diffusion 3 variants, which is consistent with its high popularity and low $\mathcal{B}_{\log}$ score. Recently, flow-based de-noising schedulers have gained increased attention in state-of-the-art T2I models like Stable Diffusion 3 and FLUX (Esser et al., 2024; BlackForestLabs, 2024).

In comparison, the EulerDiscrete scheduler (Karras et al., 2022) reports the largest bias and the highest average miss-rate. Incremental improvements in scheduler architectures since the release of EulerDiscrete, along with modern T2I models opting for newer schedulers, are logical reasons why this scheduler reports significantly high bias scores. Similarly, the EulerAncestralDiscrete scheduler, which contributes “ancestral sampling”, reports performance consistent with its predecessor. These seminal works have inspired improvements which, as shown through the FlowMatchEulerDiscrete scheduler, have resulted in significant performance gains.

We note that while using quantifiable metrics like those reported here present a step in the right direction, any definitive correlations will require a deeper analysis into the schedulers themselves.

Table 3: Observed mean and (standard deviation) across deployed schedulers. Column-wise bold values indicate the most biased behavior. We sort the schedulers along the $\mathcal{S}_{pop.}$ column in descending order. Arrows in each column indicate the direction in which observed bias increases.
Scheduler $\mathcal{S}_{pop.}$ $\downarrow B_D$ $\uparrow H_J$ $\uparrow M_G$ $\uparrow\mathcal{B}_{\log}$
FlowMatchEulerDiscrete 7.5101 (±2.181) 12.2788 (±1.541) 0.4959 (±0.008) 0.0000 (±0.000) -1.8177 (±0.106)
KDPM2AncestralDiscrete 7.4314 (±0.424) 9.5803 (±2.667) 0.5279 (±0.010) 0.0000 (±0.000) -1.4894 (±0.304)
DDIM/DDPM 7.3371 (±1.973) 10.1441 (±2.548) 0.5321 (±0.039) 0.0067 (±0.014) -1.5185 (±0.261)
EulerAncestralDiscrete 7.1003 (±1.667) 9.1328 (±3.083) 0.5447 (±0.084) 0.0213 (±0.060) -1.3332 (±0.552)
DEISMultistep 6.6288 (±0.357) 12.8746 (±3.953) 0.5713 (±0.025) 0.0067 (±0.012) -1.6710 (±0.341)
PNDM 6.3172 (±1.584) 11.1714 (±3.002) 0.5694 (±0.072) 0.0302 (±0.106) -1.4658 (±0.547)
EulerDiscrete 6.2381 (±1.392) 10.3751 (±3.573) 0.5702 (±0.124) 0.0621 (±0.190) -1.2184 (±1.175)
DPMSolverMultistep 4.9966 (±0.603) 12.1618 (±3.654) 0.5662 (±0.046) 0.0042 (±0.008) -1.6255 (±0.360)

5 Conclusion

We have conducted an extensive evaluation of text-to-image models, utilizing the open HuggingFace Hub to facilitate our analyses of the bias characteristics of 103 unique models. To improve on existing evaluation methodologies, we combine three independent metrics, i.e., (i) distribution bias, (ii) Jaccard hallucination and (iii) generative miss-rate, into a single log-scaled metric. By accounting for various generative model categories and quantifying public engagement, we have presented a comprehensive set of model evaluations. Identifying the fundamental bias characteristics of large, publicly-available text-to-image models is a critical task in a democratized AI environment, given that the exposure of these models to wider audiences continues to grow over time. The answer to the question “are models more biased now than they were three years ago?” really depends on the task. We see that iterative releases of Stable Diffusion models, for example, have resulted in marginal improvements in bias characteristics over time (from SD 1.1 to 3.5). Foundation and photo-realism models have demonstrated significant reductions in hallucination and increases in alignment, which is beneficial for improving reliability for a wider range of audiences. Style-transferred, art and animation models, in contrast, have demonstrated increased bias characteristics, a byproduct of intentionally designing models to achieve specific tasks. We hope this work inspires further research in the field and greater exposure for bias evaluation efforts.

Acknowledgments

This research and Dr. Jordan Vice are supported by the NISDRG project #20100007, funded by the Australian Government. Dr. Naveed Akhtar is a recipient of the ARC Discovery Early Career Researcher Award (project #DE230101058), funded by the Australian Government. Professor Ajmal Mian is the recipient of an ARC Future Fellowship Award (project #FT210100268) funded by the Australian Government.

References

  • Abid et al. (2021) Abubakar Abid, Maheen Farooqi, and James Zou. Persistent anti-muslim bias in large language models. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, AIES ’21, pp.  298–306, 2021. ISBN 9781450384735. doi: 10.1145/3461702.3462624. URL https://doi.org/10.1145/3461702.3462624.
  • Arrieta et al. (2020) Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador García, Sergio Gil-López, Daniel Molina, Richard Benjamins, et al. Explainable artificial intelligence (xai): Concepts, taxonomies, opportunities and challenges toward responsible ai. Information fusion, 58:82–115, 2020.
  • Bakr et al. (2023) Eslam Mohamed Bakr, Pengzhan Sun, Xiaoqian Shen, Faizan Farooq Khan, Li Erran Li, and Mohamed Elhoseiny. Hrs-bench: Holistic, reliable and scalable benchmark for text-to-image models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.  20041–20053, October 2023.
  • Betker et al. (2023) James Betker, Gabriel Goh, Li Jing, Tim Brooks, Jianfeng Wang, Linjie Li, Long Ouyang, Juntang Zhuang, Joyce Lee, Yufei Guo, et al. Improving image generation with better captions. Computer Science., 2(3):8, 2023.
  • Bird & Loper (2009) Steven Bird, Ewan Klein, and Edward Loper. Natural language processing with Python. O’Reilly Media Inc. https://github.com/nltk/nltk, 2009.
  • BlackForestLabs (2024) BlackForestLabs. Flux.1. https://huggingface.co/black-forest-labs/FLUX.1-schnell, 2024.
  • Chen et al. (2023) Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, and Zhenguo Li. Pixart-$\alpha$: Fast training of diffusion transformer for photorealistic text-to-image synthesis, 2023. URL https://arxiv.org/abs/2310.00426.
  • Chen et al. (2025) Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, and Zhenguo Li. Pixart-$\sigma$: Weak-to-strong training of diffusion transformer for 4k text-to-image generation. In European Conference on Computer Vision, pp. 74–91. Springer, 2025.
  • Chinchure et al. (2024) Aditya Chinchure, Pushkar Shukla, Gaurav Bhatt, Kiri Salij, Kartik Hosanagar, Leonid Sigal, and Matthew Turk. Tibet: Identifying and evaluating biases in text-to-image generative models. In Aleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, and Gül Varol (eds.), Computer Vision – ECCV 2024, pp.  429–446, Cham, 2024. Springer Nature Switzerland. ISBN 978-3-031-72986-7.
  • Cho et al. (2023) Jaemin Cho, Abhay Zala, and Mohit Bansal. Dall-eval: Probing the reasoning skills and social biases of text-to-image generation models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp.  3043–3054, October 2023.
  • D’Incà et al. (2024) Moreno D’Incà, Elia Peruzzo, Massimiliano Mancini, Dejia Xu, Vidit Goel, Xingqian Xu, Zhangyang Wang, Humphrey Shi, and Nicu Sebe. Openbias: Open-set bias detection in text-to-image generative models. arXiv preprint arXiv:2404.07990, 2024.
  • Esser et al. (2024) Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, Kyle Lacey, Alex Goodwin, Yannik Marek, and Robin Rombach. Scaling rectified flow transformers for high-resolution image synthesis, 2024. URL https://arxiv.org/abs/2403.03206.
  • Ferrara (2023) Emilio Ferrara. Should chatgpt be biased? challenges and risks of bias in large language models. arXiv preprint arXiv:2304.03738, 2023.
  • Gal et al. (2022) Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, and Daniel Cohen-Or. An image is worth one word: Personalizing text-to-image generation using textual inversion, 2022. URL https://arxiv.org/abs/2208.01618.
  • Gandikota et al. (2024) Rohit Gandikota, Hadas Orgad, Yonatan Belinkov, Joanna Materzyńska, and David Bau. Unified concept editing in diffusion models. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp.  5111–5120, January 2024.
  • Garcia et al. (2023) Noa Garcia, Yusuke Hirota, Yankun Wu, and Yuta Nakashima. Uncurated image-text datasets: Shedding light on demographic bias. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.  6957–6966, June 2023.
  • Gunjal et al. (2023) Anisha Gunjal, Jihan Yin, and Erhan Bas. Detecting and preventing hallucinations in large vision language models. arXiv preprint arXiv:2308.06394, 2023.
  • Ho et al. (2020) Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840–6851, 2020.
  • Hu et al. (2021) Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models, 2021. URL https://arxiv.org/abs/2106.09685.
  • Hu et al. (2023) Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, and Noah A. Smith. Tifa: Accurate and interpretable text-to-image faithfulness evaluation with question answering. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.  20406–20417, October 2023.
  • Huang et al. (2024) Yihao Huang, Felix Juefei-Xu, Qing Guo, Jie Zhang, Yutong Wu, Ming Hu, Tianlin Li, Geguang Pu, and Yang Liu. Personalization as a shortcut for few-shot backdoor attack against text-to-image diffusion models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19):21169–21178, Mar. 2024. doi: 10.1609/aaai.v38i19.30110. URL https://ojs.aaai.org/index.php/AAAI/article/view/30110.
  • Ji et al. (2023) Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12):1–38, mar 2023. ISSN 0360-0300. doi: 10.1145/3571730. URL https://doi.org/10.1145/3571730.
  • Karras et al. (2022) Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models, 2022. URL https://arxiv.org/abs/2206.00364.
  • Li et al. (2022a) Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato (eds.), Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research, pp.  12888–12900. PMLR, 17–23 Jul 2022a.
  • Li et al. (2022b) Yi Li, Rameswar Panda, Yoon Kim, Chun-Fu Richard Chen, Rogerio S Feris, David Cox, and Nuno Vasconcelos. Valhalla: Visual hallucination for machine translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.  5216–5226, 2022b.
  • Luccioni et al. (2023) Sasha Luccioni, Christopher Akiki, Margaret Mitchell, and Yacine Jernite. Stable bias: Evaluating societal representations in diffusion models. In Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track, pp.  1–14, 2023.
  • Luo et al. (2024) Hanjun Luo, Ziye Deng, Ruizhe Chen, and Zuozhu Liu. Faintbench: A holistic and precise benchmark for bias evaluation in text-to-image models, 2024. URL https://arxiv.org/abs/2405.17814.
  • Mehrabi et al. (2021) Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. A survey on bias and fairness in machine learning. ACM Computing Surveys, 54(6):1–35, 2021.
  • Naik & Nushi (2023) Ranjita Naik and Besmira Nushi. Social biases through the text-to-image generation lens. arXiv preprint arXiv:2304.06034, 2023.
  • Ni et al. (2021) Jianmo Ni, Gustavo Hernández Ábrego, Noah Constant, Ji Ma, Keith B. Hall, Daniel Cer, and Yinfei Yang. Sentence-t5: Scalable sentence encoders from pre-trained text-to-text models, 2021. URL https://arxiv.org/abs/2108.08877.
  • Podell et al. (2023) Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. Sdxl: Improving latent diffusion models for high-resolution image synthesis, 2023. URL https://arxiv.org/abs/2307.01952.
  • Radford et al. (2021) Alec Radford, Jong Wook Kim, Chris Hallacy, et al. Learning transferable visual models from natural language supervision. In Marina Meila and Tong Zhang (eds.), Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pp.  8748–8763. PMLR, 18–24 Jul 2021.
  • Ramesh et al. (2022) Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. Hierarchical text-conditional image generation with clip latents. arXiv preprint arXiv:2204.06125, 2022.
  • Rawte et al. (2023) Vipula Rawte, Amit Sheth, and Amitava Das. A survey of hallucination in large foundation models, 2023. URL https://arxiv.org/abs/2309.05922.
  • Rombach et al. (2022) Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.  10684–10695, June 2022.
  • Ronneberger & Fischer (2015) Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In Nassir Navab, Joachim Hornegger, William M. Wells, and Alejandro F. Frangi (eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, pp. 234–241, Cham, 2015. Springer International Publishing.
  • Ruiz et al. (2023) Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, and Kfir Aberman. Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.  22500–22510, June 2023.
  • Saharia et al. (2022) Chitwan Saharia, William Chan, Saurabh Saxena, et al. Photorealistic text-to-image diffusion models with deep language understanding. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh (eds.), Advances in Neural Information Processing Systems, volume 35, pp.  36479–36494, 2022. URL https://proceedings.neurips.cc/paper_files/paper/2022/file/ec795aeadae0b7d230fa35cbaf04c041-Paper-Conference.pdf.
  • Schramowski et al. (2023) Patrick Schramowski, Manuel Brack, Björn Deiseroth, and Kristian Kersting. Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.  22522–22531, June 2023.
  • Seshadri et al. (2023) Preethi Seshadri, Sameer Singh, and Yanai Elazar. The bias amplification paradox in text-to-image generation. arXiv preprint arXiv:2308.00755, 2023.
  • Sohl-Dickstein et al. (2015) Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In International conference on machine learning, pp.  2256–2265. PMLR, 2015.
  • Song et al. (2020a) Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502, 2020a.
  • Song et al. (2020b) Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020b.
  • Teo et al. (2024) Christopher Teo, Milad Abdollahzadeh, and Ngai-Man Cheung. On measuring fairness in generative models. Advances in Neural Information Processing Systems, 36, 2024.
  • Vice et al. (2023) Jordan Vice, Naveed Akhtar, Richard Hartley, and Ajmal Mian. Quantifying bias in text-to-image generative models. arXiv preprint arXiv:2312.13053, 2023.
  • Xiao et al. (2020) Qingguo Xiao, Guangyao Li, and Qiaochuan Chen. Image outpainting: Hallucinating beyond the image. IEEE Access, 8:173576–173583, 2020.
  • Zhang et al. (2023) Cheng Zhang, Xuanbai Chen, Siqi Chai, Chen Henry Wu, Dmitry Lagun, Thabo Beeler, and Fernando De la Torre. Iti-gen: Inclusive text-to-image generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.  3969–3980, October 2023.

Appendices

Appendix A: Full bias evaluation results of 103 text-to-image generative models. Evaluations are reported in $\mathcal{B}_{\log}$ ascending order. Truncated results in Table 1 of the main manuscript are a subset of the full results presented here.

Model Task Category Denoiser Resolution Release (dd/mm/yy) $\mathcal{S}_{pop.}$ $B_D$ $H_J$ $M_G$ $\mathcal{B}_{\log}$
Envvi/Inkpunk-Diffusion art PNDMScheduler 512 x 512 25/11/22 7.2323 18.9000 0.5346 0.0033 -2.1711
Yntec/beLIEve photo realism DPMSolverMultistepScheduler 768 x 768 01/08/24 5.2547 17.6176 0.5083 0.0000 -2.1589
segmind/SSD-1B photo realism EulerDiscreteScheduler 1024 x 1024 19/10/23 6.7116 15.7000 0.4747 0.0000 -2.1098
RunDiffusion/Juggernaut-X-v10 foundation EulerDiscreteScheduler 1024 x 1024 20/04/24 6.8125 16.3571 0.4992 0.0000 -2.1031
prompthero/openjourney-v4 photo realism PNDMScheduler 512 x 512 12/12/22 8.4414 15.9211 0.4881 0.0000 -2.0981
Lykon/dreamshaper-8 photo realism DEISMultistepScheduler 512 x 512 27/08/23 6.8769 17.3947 0.5467 0.0000 -2.0649
RunDiffusion/Juggernaut-XL-v9 foundation DDPMScheduler 1024 x 1024 19/02/24 8.6025 14.4048 0.4847 0.0000 -2.0046
stabilityai/sd-turbo foundation EulerDiscreteScheduler 512 x 512 28/11/23 7.9498 14.5476 0.4930 0.0000 -1.9982
eienmojiki/Anything-XL animation / anime EulerAncestralDiscreteScheduler 1024 x 1024 11/03/24 5.6000 14.8333 0.5287 0.0000 -1.9446
stabilityai/stable-diffusion-3.5-medium foundation FlowMatchEulerDiscreteScheduler 1024 x 1024 29/10/24 8.2481 14.0455 0.5049 0.0000 -1.9393
MirageML/dreambooth-nike photo realism PNDMScheduler 512 x 512 01/11/22 3.3402 14.4048 0.5206 0.0000 -1.9323
dataautogpt3/ProteusV0.3 foundation EulerDiscreteScheduler 1024 x 1024 13/02/24 7.3949 14.5833 0.5324 0.0000 -1.9196
stablediffusionapi/juggernaut-reborn foundation PNDMScheduler 512 x 512 21/01/24 5.0168 13.9783 0.5073 0.0167 -1.9129
hongdthaui/ManmaruMix_v30 animation / anime PNDMScheduler 512 x 512 19/01/24 4.4554 16.2391 0.5857 0.0033 -1.9029
digiplay/Photon_v1 photo realism EulerDiscreteScheduler 768 x 768 09/06/23 6.6038 13.4583 0.5195 0.0000 -1.8667
lambdalabs/miniSD-diffusers foundation PNDMScheduler 512 x 512 24/11/22 5.5457 14.6600 0.5640 0.0000 -1.8550
nitrosocke/mo-di-diffusion animation / anime PNDMScheduler 512 x 512 28/10/22 7.4843 14.3500 0.5696 0.0033 -1.8175
Lykon/DreamShaper photo realism PNDMScheduler 512 x 512 17/03/23 9.2373 13.7000 0.5521 0.0000 -1.8142
ItsJayQz/GTA5_Artwork_Diffusion animation / anime PNDMScheduler 512 x 512 13/12/22 6.9009 12.9348 0.5257 0.0033 -1.8106
digiplay/PerfectDeliberate-Anime_v2 animation / anime EulerDiscreteScheduler 768 x 768 07/04/24 5.9243 14.6923 0.5863 0.0000 -1.8048
RunDiffusion/Juggernaut-XL-v6 foundation EulerDiscreteScheduler 1024 x 1024 22/02/24 5.9995 14.1296 0.5766 0.0000 -1.7889
Lykon/AbsoluteReality photo realism PNDMScheduler 512 x 512 01/06/23 5.3458 12.6923 0.5297 0.0000 -1.7866
segmind/Segmind-Vega photo realism EulerDiscreteScheduler 1024 x 1024 01/12/23 6.1051 12.8462 0.5427 0.0000 -1.7706
nitrosocke/redshift-diffusion animation / anime PNDMScheduler 512 x 512 07/11/22 6.5239 12.0769 0.5139 0.0000 -1.7699
stabilityai/stable-diffusion-3.5-large foundation FlowMatchEulerDiscreteScheduler 1024 x 1024 22/10/24 9.2260 11.5769 0.4939 0.0000 -1.7680
circulus/canvers-real-v3.9.1 photo realism PNDMScheduler 512 x 512 05/05/24 3.8623 12.4615 0.5308 0.0000 -1.7658
digiplay/AbsoluteReality_v1.8.1 photo realism DDIMScheduler 768 x 768 04/08/23 6.5521 11.6481 0.5034 0.0000 -1.7552
Yntec/YiffyMix animation / anime PNDMScheduler 512 x 512 24/10/23 5.9138 12.1296 0.5259 0.0000 -1.7492
Yntec/RealLife photo realism EulerDiscreteScheduler 768 x 768 04/01/24 5.5369 11.9828 0.5216 0.0000 -1.7461
digiplay/MilkyWonderland_v1 animation / anime EulerDiscreteScheduler 768 x 768 30/09/23 4.8211 12.6538 0.5394 0.0167 -1.7458
aipicasso/emi-3 animation / anime FlowMatchEulerDiscreteScheduler 1024 x 1024 05/12/24 5.0561 11.2140 0.4891 0.0000 -1.7457
WarriorMama777/AbyssOrangeMix2 animation / anime PNDMScheduler 512 x 512 30/01/23 7.8778 12.3400 0.5353 0.0067 -1.7397
WarriorMama777/AbyssOrangeMix animation / anime PNDMScheduler 512 x 512 30/01/23 7.8316 11.5000 0.5096 0.0000 -1.7299
liamhvn/disney-pixar-cartoon-b animation / anime PNDMScheduler 512 x 512 12/07/23 4.9147 12.8548 0.5567 0.0167 -1.7235
openart-custom/CrystalClearXL photo realism EulerDiscreteScheduler 1024 x 1024 13/08/24 5.5520 12.2308 0.5464 0.0000 -1.7134
stablediffusionapi/sdxl-unstable-diffusers-y foundation EulerDiscreteScheduler 1024 x 1024 08/10/23 5.1028 11.3462 0.5151 0.0000 -1.7052
dataautogpt3/ProteusV0.2 foundation KDPM2AncestralDiscreteScheduler 1024 x 1024 19/01/24 7.1317 11.4655 0.5207 0.0000 -1.7040
stabilityai/stable-diffusion-2-1 foundation DDIMScheduler 512 x 512 07/12/22 10.5860 11.7963 0.5349 0.0100 -1.6921
nitrosocke/Arcane-Diffusion animation / anime LMSDiscreteScheduler 512 x 512 02/10/22 7.1673 10.9074 0.5004 0.0067 -1.6887
cagliostrolab/animagine-xl-3.1 animation / anime EulerAncestralDiscreteScheduler 1024 x 1024 13/03/24 8.9596 11.2143 0.5179 0.0000 -1.6876
nuigurumi/basil_mix art PNDMScheduler 512 x 512 04/01/23 7.0493 12.1800 0.5598 0.0000 -1.6792
fluently/Fluently-XL-Final foundation EulerAncestralDiscreteScheduler 1024 x 1024 06/06/24 6.6590 10.8226 0.5147 0.0000 -1.6586
CompVis/stable-diffusion-v1-4 foundation PNDMScheduler 512 x 512 20/08/22 10.7885 11.7258 0.5621 0.0000 -1.6360
SPO-Diffusion-Models/SPO-SDXL_4k-p_10ep foundation EulerDiscreteScheduler 1024 x 1024 07/06/24 6.2646 11.1667 0.5426 0.0000 -1.6307
krnl/realisticVisionV51_v51VAE photo realism PNDMScheduler 512 x 512 12/01/24 5.3375 11.1667 0.5510 0.0000 -1.6122
openart-custom/AlbedoBase foundation EulerDiscreteScheduler 1024 x 1024 13/09/24 5.5861 11.0161 0.5462 0.0000 -1.6093
xyn-ai/anything-v4.0 animation / anime PNDMScheduler 512 x 512 23/03/23 6.4597 11.6250 0.5692 0.0067 -1.6044
lemon2431/toonify_v20 animation / anime PNDMScheduler 512 x 512 16/10/23 4.1148 10.9412 0.5469 0.0033 -1.5976
SG161222/RealVisXL_V4.0 photo realism EulerDiscreteScheduler 1024 x 1024 13/02/24 8.6080 10.6250 0.5405 0.0000 -1.5856
ckpt/anything-v4.5 art PNDMScheduler 512 x 512 19/01/23 5.8262 11.5968 0.5784 0.0033 -1.5837
emilianJR/chilloutmix_NiPrunedFp32Fix photo realism PNDMScheduler 512 x 512 19/04/23 6.0849 10.7941 0.5498 0.0000 -1.5810
Kernel/sd-nsfw photo realism PNDMScheduler 512 x 512 15/07/23 5.5154 11.5333 0.5828 0.0000 -1.5711
Lykon/AAM_XL_AnimeMix animation / anime EulerDiscreteScheduler 1024 x 1024 19/01/24 6.8090 9.0152 0.4843 0.0000 -1.5367
stablediffusionapi/realistic-stock-photo photo realism EulerDiscreteScheduler 1024 x 1024 22/10/23 5.1367 10.2188 0.5451 0.0000 -1.5366
GraydientPlatformAPI/comicbabes2 art PNDMScheduler 512 x 512 07/01/24 4.0530 10.0313 0.5447 0.0033 -1.5156
SG161222/Realistic_Vision_V6.0_B1_noVAE photo realism PNDMScheduler 896 x 896 29/11/23 7.7733 11.1571 0.5957 0.0000 -1.5066
scenario-labs/juggernaut_reborn photo realism DPMSolverMultistepScheduler 512 x 512 29/05/24 4.5414 10.0294 0.5504 0.0000 -1.5061
SG161222/RealVisXL_V5.0 photo realism DDIMScheduler 1024 x 1024 05/08/24 6.6694 9.9412 0.5482 0.0000 -1.5022
stable-diffusion-v1-5/stable-diffusion-v1-5 foundation PNDMScheduler 512 x 512 07/09/24 9.2497 9.6429 0.5361 0.0000 -1.4981
Corcelio/openvision foundation EulerDiscreteScheduler 1024 x 1024 13/05/24 5.7229 9.2576 0.5161 0.0033 -1.4963
gsdf/Counterfeit-V2.5 animation / anime DDIMScheduler 512 x 512 02/02/23 8.8241 12.2059 0.6204 0.0433 -1.4890
fluently/Fluently-XL-v4 foundation EulerAncestralDiscreteScheduler 1024 x 1024 02/05/24 6.8337 9.2222 0.5205 0.0000 -1.4866
Ojimi/anime-kawai-diffusion animation / anime DEISMultistepScheduler 512 x 512 24/03/23 6.2192 11.1667 0.5968 0.0200 -1.4843
tilake/China-Chic-illustration art PNDMScheduler 512 x 512 15/01/23 5.7534 10.0294 0.5655 0.0000 -1.4720
digiplay/FormCleansingMix_v1 animation / anime DPMSolverMultistepScheduler 768 x 768 20/06/23 4.4630 10.8243 0.5936 0.0167 -1.4647
Lykon/dreamshaper-7 photo realism DEISMultistepScheduler 512 x 512 27/08/23 6.7903 10.0625 0.5704 0.0000 -1.4638
gligen/diffusers-generation-text-box photo realism PNDMScheduler 512 x 512 11/03/23 5.5617 10.0294 0.5719 0.0000 -1.4570
stabilityai/stable-diffusion-xl-base-1.0 photo realism EulerDiscreteScheduler 1024 x 1024 25/07/23 11.1360 8.6515 0.5076 0.0000 -1.4492
pt-sk/stable-diffusion-1.5 foundation PNDMScheduler 512 x 512 02/03/24 5.0628 9.5556 0.5584 0.0000 -1.4397
playgroundai/playground-v2.5-1024px-aesthetic art EDMDPMSolverMultistepScheduler 1024 x 1024 17/02/24 8.8552 9.5270 0.5649 0.0000 -1.4219
danbrown/RevAnimated-v1-2-2 animation / anime DDIMScheduler 256 x 256 01/05/23 5.0277 8.5000 0.5134 0.0000 -1.4198
redstonehero/animesh_prunedv21 animation / anime PNDMScheduler 512 x 512 17/08/23 4.0369 9.6622 0.5705 0.0033 -1.4197
dreamlike-art/dreamlike-photoreal-2.0 photo realism DDIMScheduler 768 x 768 04/01/23 8.6776 8.8415 0.5451 0.0033 -1.3885
emilianJR/epiCRealism photo realism PNDMScheduler 512 x 512 25/06/23 6.9505 8.8846 0.5499 0.0067 -1.3793
naclbit/trinart_characters_19.2m_stable_diffusion_v1 animation / anime PNDMScheduler 512 x 512 15/10/22 6.2754 12.3750 0.6536 0.0767 -1.3757
segmind/tiny-sd photo realism DPMSolverMultistepScheduler 512 x 512 28/07/23 5.7272 10.1757 0.6124 0.0000 -1.3722
yodayo-ai/kivotos-xl-2.0 animation / anime EulerAncestralDiscreteScheduler 1024 x 1024 02/06/24 6.5836 7.1410 0.4717 0.0000 -1.3277
ZB-Tech/Text-to-Image photo realism PNDMScheduler 512 x 512 10/03/24 6.4625 8.7162 0.5697 0.0000 -1.3220
digiplay/aurorafantasy_v1 animation / anime EulerDiscreteScheduler 768 x 768 06/04/24 5.2524 8.8864 0.5813 0.0133 -1.3005
stablediffusionapi/mklan-xxx-nsfw-pony photo realism EulerDiscreteScheduler 1024 x 1024 29/05/24 6.5259 7.4744 0.5100 0.0000 -1.2981
stabilityai/sdxl-turbo foundation EulerAncestralDiscreteScheduler 512 x 512 27/11/23 10.1726 8.0778 0.5493 0.0000 -1.2922
Lykon/dreamshaper-xl-v2-turbo foundation EulerDiscreteScheduler 1024 x 1024 08/02/24 6.6775 6.7381 0.4679 0.0000 -1.2769
dataautogpt3/OpenDalleV1.1 foundation KDPM2AncestralDiscreteScheduler 1024 x 1024 22/12/23 7.7311 7.6951 0.5351 0.0000 -1.2747
Corcelio/mobius foundation EulerDiscreteScheduler 1024 x 1024 13/05/24 6.0584 7.2273 0.5280 0.0000 -1.2271
kandinsky-community/kandinsky-2-1 art DDIMScheduler 512 x 512 24/05/23 6.5896 7.1735 0.5332 0.0000 -1.2086
friedrichor/stable-diffusion-2-1-realistic photo realism DDIMScheduler 512 x 512 04/06/23 4.5053 6.7857 0.5061 0.0033 -1.2060
CompVis/stable-diffusion-v1-1 foundation PNDMScheduler 512 x 512 19/08/22 6.5538 6.8864 0.5218 0.0200 -1.1716
stablediffusionapi/realistic-vision-51 photo realism PNDMScheduler 512 x 512 07/08/23 5.8350 6.9490 0.5456 0.0000 -1.1498
UnfilteredAI/NSFW-gen-v2 photo realism EulerAncestralDiscreteScheduler 1024 x 1024 15/04/24 6.8121 6.4111 0.5102 0.0000 -1.1442
nitrosocke/Ghibli-Diffusion animation / anime PNDMScheduler 704 x 512 18/11/22 7.6340 6.3478 0.5523 0.0000 -1.0444
dataautogpt3/ProteusV0.4-Lightning foundation EulerDiscreteScheduler 1024 x 1024 22/02/24 6.2324 4.6739 0.4055 0.0000 -1.0220
SG161222/Realistic_Vision_V2.0 photo realism PNDMScheduler 512 x 512 21/03/23 8.3000 6.1154 0.5480 0.0000 -1.0167
sd-community/sdxl-flash foundation DPMSolverSinglestepScheduler 1024 x 1024 19/05/24 7.2798 4.7826 0.4424 0.0000 -0.9809
Mitsua/mitsua-diffusion-cc0 art PNDMScheduler 512 x 512 22/12/22 5.1305 10.8947 0.7426 0.0900 -0.9368
OnomaAIResearch/Illustrious-xl-early-release-v0 animation / anime EulerDiscreteScheduler 1024 x 1024 20/09/24 7.3851 11.6000 0.7524 0.1700 -0.8688
digiplay/ZHMix-Dramatic-v2.0 animation / anime EulerDiscreteScheduler 768 x 768 03/12/23 4.7306 5.7800 0.6506 0.0333 -0.6689
DGSpitzer/Cyberpunk-Anime-Diffusion animation / anime PNDMScheduler 704 x 704 28/10/22 6.7796 2.9722 0.5845 0.0600 -0.1492
Emanon14/NONAMEmix_v1 animation / anime EulerAncestralDiscreteScheduler 1024 x 1024 23/11/24 5.1821 5.3400 0.7447 0.1700 -0.1237
Onodofthenorth/SD_PixelArt_SpriteSheet_Generator art PNDMScheduler 512 x 512 01/11/22 6.4996 6.5930 0.7898 0.3367 0.0841
Niggendar/duchaitenPonyXLNo_ponyNoScoreV40 art EulerDiscreteScheduler 1024 x 1024 01/06/24 5.1912 3.1620 0.7458 0.1000 0.3237
lambdalabs/sd-pokemon-diffusers animation / anime PNDMScheduler 512 x 512 16/09/22 6.3118 9.6509 0.8639 0.6033 0.6519
Raelina/Raehoshi-illust-XL-3 animation / anime EulerDiscreteScheduler 1024 x 1024 11/12/24 3.7427 4.8772 0.8926 0.6700 1.7549
monadical-labs/minecraft-skin-generator-sdxl animation / anime EulerDiscreteScheduler 768 x 768 19/02/24 5.3317 5.9786 0.9721 0.7933 3.3680
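
The ordering and category-level trends in this table can be recomputed directly from the reported values. Below is a minimal Python sketch, assuming the table above has been exported to a CSV file named bias_evaluations.csv; the file name and the column labels (task_category, B_D, H_J, M_G, B_log) are our own illustrative choices, not artifacts shipped with this paper.

import pandas as pd

# Load the Appendix A results (assumed CSV export of the table above).
df = pd.read_csv("bias_evaluations.csv")

# Reproduce the ordering used in this appendix: ascending composite log bias.
df = df.sort_values("B_log")

# Per-category means of the three bias dimensions and the composite score,
# e.g. to compare artistic/style-transferred models against foundation models.
summary = df.groupby("task_category")[["B_D", "H_J", "M_G", "B_log"]].mean()
print(summary.sort_values("B_log"))

Sorting the per-category means by B_log gives a quick view of which generative task categories skew most biased under the composite metric.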