Do AI Image Generators See the Pacific Region? - Pacific Broadband and Digital Equity

(Left) A Samoan student at American Samoa Community College. (Right) A college student at Miramar Community College in San Diego. Depictions by Midjourney.

AI Image Generators like Dall-E, Midjourney, Stability AI and Meta Imagine have extensive visual vocabularies, but what do they know about the Pacific Region and its people? Judge for yourself with the gallery.

Last week, my team at American Samoa Community College started planning the development of a new program website. We discussed possible sources for site imagery, and this seemed like an opportunity to check out some of the Artificial Intelligence (AI)-based image generation sites that are increasingly popular as we approach the start of 2024.

Working with several of the popular sites, including OpenAI’s Dall-E, Midjourney, Stability AI, and Meta’s Imagine, I experimented with images we could use for the website: if not permanently, then at least as placeholders that might later be replaced with actual photographs.

At this point, a quick note: the Midjourney homepage/gallery in particular can occasionally be NSFW (in somewhat the same way that a museum might be), depending upon what the latest user posts happen to look like. Just a warning for those who might need one.

Website art “filler” is just the kind of chore that Generative AI promises to make short work of in the near future. Unfortunately, in our case, none of the tools worked well. The AIs generated images of simulated people who lacked the typical attire, cultural mannerisms, or other context that “feels real” to everyday life in American Samoa. On closer inspection, many of the places and situations depicted also seemed unrealistic.

Out of curiosity, I then tried prompting the same AIs with requests for college scenes from the mainland United States. The image at the top of this page shows two of the results side by side, both generated by the AI engine Midjourney. The left panel is the image I received by prompting (asking the AI) for “A student standing outside American Samoa Community College, checking her phone to see if her instructor has emailed her the latest assignment.” The right panel is the result for “A student standing outside Miramar Community College in San Diego, checking her phone to see if her instructor has emailed her the latest assignment.”

While the Midjourney engine captures the physical ASCC campus setting reasonably well, the student depicted does not seem representative in their appearance, dress, hairstyle, and so on. On the other hand, the imagined student from San Diego looks right at home for a California campus. I am not picking on Midjourney in particular for this example: it was actually the best performer out of the AIs mentioned above at handling my particular request.

Gallery: 15 Image Comparisons between places and AI Engines

Prompts
Top: "Samoan teenagers at a tech hub in Apia, learning coding and digital media production on their laptops and tablets."

Bottom: "Teenagers in New England at a community tech center, working on app development and digital arts projects on their devices."

Prompts
Top: "Polynesian performers at a contemporary music and arts festival in Hawaii, incorporating traditional elements in their modern performances."

Bottom: "A group at a Vermont music festival, showcasing a mix of indie and traditional New England musical styles."

Prompts
Top: "Chamorro youths in Guam participating in a digital photography course, capturing the island's natural beauty with advanced cameras."

Bottom: "Londoners attending an advanced drone photography workshop, capturing the dynamic cityscape along the Thames."

Prompts
Top: "Polynesian navigators in a modern sailing vessel using GPS and traditional star navigation off the coast of Tahiti."

Bottom: "Yacht enthusiasts off the coast of Maine navigating with the latest technology, enjoying the Atlantic seascapes."

Prompts
Top: "A Samoan family hosting a backyard barbecue, blending traditional dishes with modern cooking methods."

Bottom: "A family in the English countryside using a modern outdoor kitchen to prepare a mix of local and international cuisine."

Prompts
Top: "Micronesian children in a community center using tablets for educational games and learning about their cultural heritage."

Bottom: "Children in a Massachusetts park engaging with augmented reality games on their tablets."

Prompts
Top: "A community event in Guam featuring live music blending traditional Chamorro and contemporary styles, with vibrant decorations."

Bottom: "A traditional English village fete, updated with contemporary music acts and modern artisanal food stalls."

Prompts
Top: "Samoan cultural practitioners performing a modern interpretation of the 'siva tau' dance in a cultural exhibition."

Bottom: "Historical reenactors in New England using multimedia tools to enhance the authenticity of their period costumes and settings."

Prompts
Top: "A Polynesian entrepreneur in a co-working space in Honolulu, conducting an international business meeting online."

Bottom: "A British entrepreneur in a high-tech London office, leading a global video conference surrounded by modern art."

Prompts
Top: "Drone footage capturing the breathtaking scenery of a Micronesian atoll, highlighting the blend of nature and local communities."

Bottom: "Drone views of a picturesque New England town, highlighting its historical charm blended with modern living."

Prompts
Top: "Micronesian marine biologists using modern equipment to monitor coral reefs and marine ecosystems."

Bottom: "UK environmental scientists in a state-of-the-art lab, researching innovative conservation techniques."

Prompts
Top: "A lively night market in Samoa, featuring local artisans, contemporary food stalls, and live entertainment."

Bottom: "A New England evening market, with vendors selling modern crafts, gourmet local foods, and featuring a live band."

Prompts
Top: "Polynesian tattoo artists in a modern studio, combining traditional designs with new-age tattoo technology."

Bottom: "Contemporary artists in a London studio, experimenting with digital art forms alongside traditional techniques."

Prompts
Top: "A cozy family beachside gathering near Vaitogi, American Samoa."

Bottom: "A Providence RI-based family home, with a cozy family gathering."

Prompts
Top: "Fishermen in Samoa using modern fishing techniques and equipment, showcasing a sunset at the harbor."

Bottom: "Fishermen in Cornwall using sustainable fishing practices and modern equipment, with a sunset over the coastline."

Creating a Test

To satisfy my curiosity about how effectively AI image generators could be used in Pacific regional work contexts, I set out to build a demonstration using prompts that reference both the Pacific region and scenes from mainland America and the UK, where many AIs are being developed. Inspired by a somewhat similar project at Gizmodo (“Which AI Image Generator is the Best,” published earlier this month), I picked two of the most popular image AI tools (Dall-E and Midjourney) and wrote equivalent prompts describing the same types of activities, but with settings either on the continent or in the islands.

To keep my prompts as “neutral” as possible, I created a list of basic activities and scenes and asked ChatGPT 4 to give me prompts set in Samoa/American Samoa, Guam and Saipan, Micronesia and Polynesia–and then counterparts from New England or London. So, for instance:

"Chamorro youths in Guam participating in a digital photography course, capturing the island's natural beauty with advanced cameras."

and then the corresponding prompt:

"Londoners attending an advanced drone photography workshop, capturing the dynamic cityscape along the Thames."

I fed 15 of these sets into both the Dall-E and Midjourney generators, which produced over a hundred images from each of the two AIs. There were a couple of instances where I had to tweak a prompt to get one of the image bots to behave itself, but I tried to stay as faithful to the list as possible. From the hundreds of results, I handpicked the most credible result for each prompt by each AI. Finally, I used a little bit of creative Python programming to put all the images together in the gallery above.
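The article does not show the Python used to assemble the gallery, so the following is only a rough sketch of one way the layout step might look. It assumes each comparison panel is a 2×2 grid (the Pacific prompt on the top row, its counterpart on the bottom row, one column per AI engine) and that every image is a square tile. The names `panel_layout`, `Cell`, `tile`, and `gap` are hypothetical, and the actual pasting of pixels would be left to an imaging library.

```python
from dataclasses import dataclass

ENGINES = ("Dall-E", "Midjourney")  # the two engines used for the gallery
ROWS = ("top", "bottom")            # assumed: Pacific prompt on top, counterpart below


@dataclass(frozen=True)
class Cell:
    row: str     # "top" or "bottom"
    engine: str  # which generator produced the image
    x: int       # left edge of the tile on the canvas, in pixels
    y: int       # top edge of the tile on the canvas, in pixels


def panel_layout(tile=512, gap=8):
    """Return the canvas size and tile positions for one comparison panel.

    tile and gap are illustrative values, not taken from the article.
    """
    cells = []
    for r, row in enumerate(ROWS):
        for c, engine in enumerate(ENGINES):
            cells.append(Cell(row, engine, c * (tile + gap), r * (tile + gap)))
    width = len(ENGINES) * tile + (len(ENGINES) - 1) * gap
    height = len(ROWS) * tile + (len(ROWS) - 1) * gap
    return (width, height), cells


size, cells = panel_layout()
print("canvas size:", size)
for cell in cells:
    print(cell)
```

Separating the layout arithmetic from the image handling keeps the placement logic easy to check by hand before any files are touched.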

Thinking About Guam and Hawaiʻi

The state of Hawaiʻi and, to a different degree, the island of Guam enjoy relative wealth and technology compared to most Pacific Island counterparts. A colleague astutely pointed out that everyday dress and activities in these locations often resemble the US mainland more than Pacific Island norms. This observation piqued my curiosity, prompting a second, smaller experiment comparing AI-generated visualizations for Hawaiʻi and Guam with similar scenes from Texas and Colorado.

I simplified the process by sticking with Midjourney for all of these tests (rather than using multiple AIs). I generated images using the prompts:

1. Father and daughter shop at a supermarket in [Place]
2. Three college students study outside on campus in [Place]
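As a small illustration of how the two templates above expand into individual prompts, here is a minimal Python sketch. It assumes the place names shown as captions in the gallery are the exact strings substituted for [Place]. The names `PROMPT_PLACES` and `build_prompts` are hypothetical and not part of any tool mentioned here.

```python
# Each template is paired with the place names assumed to have been used with it.
PROMPT_PLACES = {
    "Father and daughter shop at a supermarket in {place}": [
        "Hawaii", "Guam", "Texas", "Colorado",
    ],
    "Three college students study outside on campus in {place}": [
        "University of Hawaii", "University of Guam",
        "University of Texas", "Colorado University",
    ],
}


def build_prompts(prompt_places):
    """Fill every template with each of its place names, preserving order."""
    return [
        template.format(place=place)
        for template, places in prompt_places.items()
        for place in places
    ]


prompts = build_prompts(PROMPT_PLACES)
for prompt in prompts:
    print(prompt)
```

Holding the wording fixed and varying only the place keeps any differences between the generated images attributable to the location rather than to the phrasing.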

Here are some of the results, with the “Place” prompt listed below each image. Midjourney automatically generates four images for every prompt, which is why we have four blocks in each entry in the gallery below. This way, we get to see a variety of ways the AI engine “thinks” about the places and people listed.

Hawaii
Guam
Texas
Colorado
University of Hawaii
University of Guam
University of Texas
Colorado University

All images generated by Midjourney AI.

True to my colleague’s insight, there is clearly less disparity in “real-life fidelity loss” between the locations in the imagery Midjourney created for this second test than there is for scenes set in Samoa or in regions of Micronesia outside of Guam. At first glance, a significant factor seems to be the realism of the clothing worn by the imaginary subjects.

Still, even with these seemingly familiar settings, a nagging feeling persists that something remains subtly off for scenes in Guam and Hawaiʻi. This discrepancy hints at the need for deeper, culturally-informed exploration of AI image generation in these and other geographically distinct regions. It’s not just about ensuring accurate depictions of human figures, but also the broader “place settings.” Architectural styles, natural landscapes, and even everyday objects all contribute to the unique character of a place, and AI would ideally learn to capture them as well.

Results and Takeaways

"A lively night market in Samoa, featuring local artisans, contemporary food stalls, and live entertainment." Imagine with Meta AI. Meta seems to be relatively good at generating regionally-specific imagery, but frequently--as of December 2023, anyway--gets basic details incorrect for the prompts it receives.

As we look at how AI image generators depict the Pacific region, it’s important to be cautious before leaping to conclusions. There are clear areas where these tools don’t quite capture the region accurately, but we would need much more information to say whether this is due to biases in their development, inattention from developers, or a simple lack of available training data from regional sources. This is a complex issue that needs more in-depth study. We present the image gallery with the goal of allowing people to make their own assessments, in the hopes of spurring more awareness on the topic.

While I have tried to approach the testing of generative AI for images in a systematic way, there’s definitely room to add more structure and depth to this investigation. In fact, there’s a growing body of scientific research focusing on fairness and equity in AI image generation, and more broadly, in all aspects of generative AI. This body of work highlights the complexities and ongoing challenges in the field, indicating areas where further study and improvement are needed.

Unlike chatbots, text-to-image AI systems face complexities in at least two key areas of training that can introduce bias: the visual libraries that train the ‘diffusion-engines’ and the language processing critical for interpreting user requests. This dual challenge involves not just understanding visual aspects but also accurately grasping the linguistic context of user inputs. While extensive research details these general biases, there’s a notable gap in studies specifically focusing on the Pacific region. The unique characteristics of the region demand a more detailed analysis to isolate and effectively address these specific biases.

Finally, while Hawaiʻi and Guam may share certain technological and economic traits with the US mainland, their cultural identities remain distinct. Exploring how AI image generation can accurately portray these nuances – in both people and place – represents a crucial next step in ensuring responsible and inclusive technologies for everyone. This journey requires diverse training data sets, culturally informed algorithms, and active collaboration with local communities. We’ve only just begun to scratch the surface of what’s possible, and Guam and Hawaiʻi offer valuable testing grounds for shaping the future of AI representation in all its complexity.

Final Thoughts

For now, my key takeaway is this: as of early 2024, AI image generation is an important technology, one of many, that still needs significant improvements to effectively and equitably serve the people of the Pacific region. Regardless of the reasons, morals, or politics surrounding this phenomenon, it’s clear that the people of the Pacific region deserve the best access to modern tools, and improvements are surely possible with this exciting new technology.

We are exploring ways to delve deeper into this topic in the coming months. In the meantime, we’ll continue our experiments with AI, and we’re eager to hear your thoughts on the images and your experiences with Generative AI, both in your daily work and personal life.

This article was updated with the section on Hawaiʻi and Guam in January 2024.