Small Models, Big Impact: Deploying LLMs from Factory to Global Operations


29 April 2025


In this episode of PowerTalk, we speak with Ian Chen, Director of the Intelligent Digital Technology Division at Compal. He shares insights on Compal’s AI deployments, explains how language models work, discusses LLM applications in manufacturing, and offers practical advice for enterprises.

LLMs in the Factory

Ian: Large language models (LLMs) are exceptionally strong at semantic understanding and data interpretation, which is why we see them as valuable tools for tackling challenges in manufacturing operations. Their ability to process and extract meaningful insights from vast datasets makes them especially useful in production environments, where data from multiple teams and processes can quickly become overwhelming. Traditionally, searching through massive volumes of manufacturing data requires technical knowledge of databases and a series of manual steps. But LLMs simplify this dramatically. By allowing users to ask questions in natural language—just like speaking—they enable quick access to relevant information and even generate summary reports automatically. This eliminates the need to read through lengthy documents or manually create summaries, significantly boosting factory-wide operational efficiency.
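To make that query pattern concrete, here is a minimal sketch in Python: translate a plain-language question into SQL, run it, and have the model summarize the result. The `llm_complete` function is a placeholder for whatever model endpoint you use, and the `defects` table is invented for illustration; this is not Compal’s implementation.

```python
# Sketch of natural-language querying over factory data.
# `llm_complete` is a hypothetical callable: prompt in, text out.
import sqlite3

SCHEMA = """CREATE TABLE defects (
    station TEXT, product TEXT, defect_type TEXT, ts TEXT
);"""

def question_to_sql(question: str, llm_complete) -> str:
    """Ask the LLM to translate a plain-language question into SQLite SQL."""
    prompt = (
        "You translate questions about a factory database into SQLite SQL.\n"
        f"Schema:\n{SCHEMA}\n"
        f"Question: {question}\n"
        "Answer with a single SQL query only."
    )
    return llm_complete(prompt)

def answer(question: str, db: sqlite3.Connection, llm_complete) -> str:
    sql = question_to_sql(question, llm_complete)
    rows = db.execute(sql).fetchall()      # run the generated query
    summary_prompt = (
        f"Question: {question}\nQuery result: {rows}\n"
        "Write a one-paragraph summary report for a line supervisor."
    )
    return llm_complete(summary_prompt)    # summarize results in plain language
```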

As manufacturers pursue digital transformation, one of the defining shifts is the explosive growth in data. The more digitized a factory becomes, the more data it produces. Yet the challenge lies in turning that raw data into valuable business insight.

Ian: This is where LLMs shine. They don’t just filter data—they contextualize it, summarize it, and transform it into actionable intelligence that supports both team-level and executive decision-making. This ties directly into a company’s broader transformation strategy, especially when it comes to ESG (Environmental, Social, and Governance) goals. ESG emphasizes sustainable value creation, reducing waste, and improving resource efficiency—all areas where LLMs can contribute. For instance, by leveraging their predictive capabilities, LLMs can help forecast when inventory restocking is necessary, optimize stock levels, and improve production scheduling—key steps toward leaner and greener operations. While many AI tools play roles in digital transformation, LLMs are currently the cornerstone for advanced data processing. When people refer to AI today, this is often what they mean.

And now, we’re seeing the emergence of multimodal applications—tools that combine different types of inputs like text, sound, and visuals—entering the manufacturing domain. One critical use case is predictive maintenance. Equipment downtime is a major concern in manufacturing, and the best way to avoid it is to detect issues before they cause disruptions. Traditional maintenance strategies either respond after a failure (which is risky and costly) or follow a rigid schedule regardless of equipment condition (which leads to waste). Predictive maintenance, in contrast, allows repairs to occur just before failure—maximizing uptime while minimizing unnecessary interventions.

LLMs can support this by analyzing structured and unstructured data—just like in stock market forecasting, where the more relevant data you have, the better your prediction. And the value increases even more when other dimensions are added. For example, abnormal sounds from a motor or visible cracks in a belt are signs a machine is about to fail, but they may not be captured in traditional datasets. By integrating visual, audio, and textual inputs, we can dramatically improve the accuracy of diagnostics and failure predictions.

This is why we believe multimodal models will play a pivotal role in manufacturing’s future. They don’t just expand what AI can see and hear—they make the predictions more accurate and accessible, even for frontline managers who may not be technical experts. Of course, beyond maintenance, there are many other potential applications, such as generating creative content or automating documentation, which are also in development. But the fusion of sound, image, and LLM capabilities is already unlocking practical, user-friendly solutions like predictive maintenance—and doing so with a level of precision that traditional tools can’t match.
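As an illustration of the sensor side of predictive maintenance, the sketch below flags a machine whose vibration and audio readings drift from normal behavior. The features, values, and thresholds are invented; in the multimodal pipeline described above, signals like these would be combined with text logs and images before any diagnosis is generated.

```python
# Sketch: flag anomalous machine readings before failure.
# Features are hypothetical: [vibration (g), audio level (dB)].
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic "healthy" history for one motor.
normal = rng.normal(loc=[0.2, 55.0], scale=[0.05, 2.0], size=(500, 2))
model = IsolationForest(contamination=0.01, random_state=0).fit(normal)

readings = np.array([
    [0.21, 54.0],   # healthy motor
    [0.65, 71.0],   # abnormal vibration and noise: likely impending failure
])
for r, score in zip(readings, model.decision_function(readings)):
    status = "schedule inspection" if score < 0 else "ok"
    print(r, status)
```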

What Makes a Good LLM


Ian: When we talk about what makes a model “good,” the first and most important factor is domain specificity. For example, if I work in electronics manufacturing, I want the model to handle content relevant to electronics—not a general-purpose AI like ChatGPT that knows a bit of everything, from astronomy to geography. I need it to understand electronics manufacturing deeply, including all the technical terminology that comes with the field. That level of specialization is essential.

Second, the model must provide answers that are aligned with the context of the industry. If I ask a question related to electronics, I expect an answer grounded in that domain—not something pulled from finance or healthcare. Domain alignment is key to delivering value. That’s why we believe that, in the future, every industry will have its own dedicated language model—one for healthcare, one for finance, one for manufacturing, and so on.

The next critical requirement is accuracy. If the model’s answers aren’t accurate, it simply isn’t useful. But accuracy alone isn’t enough—it also needs to be reliable. And reliability doesn’t just depend on the model itself; it requires supporting tools and techniques. This is where retrieval-augmented generation (RAG) comes in. One major issue with today’s large language models is that they often give answers that aren’t grounded in the provided information. RAG allows us to constrain the model so that it only responds based on the material we supply, which enhances both accuracy and reliability.

Another key factor is system integration. In factories, you have many systems, such as MES. The model must be able to pull and process data from these systems, so that its answers and outputs are integrated directly into your operational ecosystem. This is what we call system integration capability.

Finally, scalability is another important consideration. If the model solves one problem well, can it be applied to other use cases? In theory, yes—but each new domain typically requires a different knowledge base. That means you’ll need to supply new materials for each scenario, even though the model’s language capabilities remain consistent. The goal is to control the source of information the model draws from, ensuring that its responses are always rooted in trusted content.

In summary, a truly effective model must be domain-specific, context-aware, accurate, reliable, easily integrated with existing systems, and scalable across similar challenges within or across industries.
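A minimal sketch of the retrieval-augmented pattern described above: retrieve the most relevant documents, then constrain the model to answer only from them. The documents are invented examples, and TF-IDF stands in for whatever retriever a production system would actually use.

```python
# Sketch: ground the model's answer in supplied material only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "SMT line 3 halted on 2024-05-02: solder paste viscosity out of spec.",
    "Reflow oven zone 4 temperature drift causes tombstoning on 0402 parts.",
    "AOI false rejects rise when conveyor speed exceeds 1.2 m/min.",
]

vec = TfidfVectorizer().fit(docs)
doc_matrix = vec.transform(docs)

def build_grounded_prompt(question: str, k: int = 2) -> str:
    """Retrieve the k most relevant documents and constrain the model to them."""
    sims = cosine_similarity(vec.transform([question]), doc_matrix)[0]
    context = "\n".join(docs[i] for i in sims.argsort()[::-1][:k])
    return (
        "Answer using ONLY the context below. If the context is "
        f"insufficient, say so.\nContext:\n{context}\nQuestion: {question}"
    )

print(build_grounded_prompt("Why did SMT line 3 stop?"))
```

The prompt itself does the constraining here; the retriever merely decides which trusted material the model is allowed to see.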

Choosing between Big and Small LLMs


Ian: The capability of a large language model (LLM) is often associated with its size, which refers to the number of parameters it contains. While many assume that bigger is automatically better, this isn’t always the case. For context, models like ChatGPT can have over 100 billion parameters, whereas smaller models may have fewer than 10 billion. This size difference directly impacts hardware requirements, so selecting a model really depends on the complexity of the task at hand.

If the application is relatively straightforward, a smaller model is usually sufficient—and much more cost-effective.

Ian: For example, in a collaboration with Budweiser, we used a 700-million-parameter model to interpret specific operational data. It was trained on targeted industry terms that wouldn’t typically be found in open-source models, which often require fine-tuning for specialized use cases.

By customizing these models—either through vocabulary injection during training or document referencing at inference—we can tailor them to industry-specific needs.

Ian: This dual approach of fine-tuning and retrieval means the model can both “remember” pre-trained knowledge and fetch relevant information on demand. For instance, if asked why a production line stopped, the model can respond using insights pulled from over 1,000 historical documents—identifying similar past cases and offering possible solutions.
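The “similar past cases” lookup can be sketched with nothing more than token overlap. The case records below are invented; a production system would use embeddings over the full document history rather than this toy similarity measure.

```python
# Sketch: find historical stoppage cases similar to a new incident.
CASES = [
    {"report": "line stop: feeder 12 jammed, 0603 resistor reel misaligned",
     "fix": "re-seat reel, recalibrate feeder pitch"},
    {"report": "line stop: pick-and-place nozzle vacuum loss on head 2",
     "fix": "replace nozzle o-ring, clean vacuum filter"},
]

def similar_cases(query: str, top_k: int = 1):
    """Rank past cases by token overlap (Jaccard) with the query."""
    q = set(query.lower().split())
    scored = []
    for case in CASES:
        words = set(case["report"].split())
        scored.append((len(q & words) / len(q | words), case))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [case for score, case in scored[:top_k] if score > 0]

for case in similar_cases("why did the line stop? feeder jam on reel 12"):
    print(case["report"], "->", case["fix"])
```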

Suggestion for SMEs: Use “Small” Models

Ian: However, small and medium enterprises (SMEs), especially in Taiwan, often face budget constraints when considering AI upgrades. Many hesitate due to the perceived high cost of large models and the infrastructure they require. But in reality, for functions like document search or report generation, a small model running on a basic PC or mid-range laptop—perhaps costing around NT$200,000—is more than enough. There’s no need to invest millions in AI servers. The issue isn’t lack of willingness, but limited financial capacity, which slows down industrial transformation compared to large corporations with more resources.

That’s why model selection should be based on the actual use case, the task requirements, and available resources. The trend is shifting toward smaller, more efficient models that demand less hardware and energy. Not only are these models more affordable, but they’re also more sustainable. Large models consume significant power and aren’t aligned with ESG (Environmental, Social, and Governance) goals.

Initially, the AI industry focused heavily on building massive models. But today, there’s a growing movement toward compact models that still deliver solid performance. While smaller models might fall slightly behind in accuracy, the trade-off is minimal and often worth the savings. The real question becomes: what can the model actually do for you?

On-Premise AI

If your needs involve simple tasks like applying face filters or providing quick responses on a mobile app, you don’t need a behemoth like GPT. The future lies in lightweight models that can be deployed on edge devices—smartphones, laptops, and tablets. This is also the direction Taiwan is embracing. For AI to truly scale, it must run efficiently on everyday user devices. With billions of mobile devices globally, deploying models to the edge unlocks immense power and reach.

Of course, deploying smaller models may come with slight drops in accuracy. But thanks to algorithmic advances, some claim that 1-billion-parameter models can now rival the performance of older 10-billion-parameter models. Ultimately, it’s not about how big the model is—it’s about what it can actually do.
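For a sense of how lightweight this can be in practice, here is a sketch of running a small quantized model locally with llama-cpp-python. The model file name is a placeholder for any 1B-class quantized GGUF checkpoint; no server, GPU cluster, or network connection is assumed.

```python
# Sketch: a small quantized model running on an ordinary laptop.
from llama_cpp import Llama

# Placeholder path: any small quantized GGUF model would do.
llm = Llama(model_path="small-model-q4.gguf", n_ctx=2048)

out = llm(
    "Summarize: line 3 yield dropped 2% after the solder paste lot change.",
    max_tokens=80,
    temperature=0.2,
)
print(out["choices"][0]["text"])
```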

How AI Develops in Manufacturing


Ian: AI 1.0 refers to the stage of artificial intelligence development that began around 2018. This era is characterized by what we call “discriminative AI”—systems designed to make binary judgments, such as the defect detection mentioned earlier. In manufacturing, for instance, AI 1.0 is used to determine whether a circuit board or assembly has any defects, missing components, or abnormalities. It evaluates whether something passes or fails, a simple yes-or-no decision. This stage also encompasses familiar technologies like facial recognition. All of these applications fall under the umbrella of AI 1.0, where the core function is discrimination: right or wrong, present or absent.

In contrast, AI 2.0 represents a shift toward generation. This new phase of AI can generate content based on the input or materials it receives. Rather than simply evaluating existing information, it produces new content—text, images, audio, and more. The key capability of AI 2.0 is its generative power, which defines this stage. Initially, AI 2.0 was mainly focused on large language models that generated text. But by early this year, companies began introducing what are called multimodal models. These go beyond just language and text—they incorporate and understand multiple types of media, including images, video, and sound. Today, we’re seeing a wide range of generative applications. You can input a sentence, and the model can create a video, generate an image, or even compose music based on the meaning of your input. This is the essence of generative AI—creating something from nothing. Naturally, many are now wondering: how can multimodal AI be applied in manufacturing? And that’s exactly the direction we plan to explore and focus on further this year.

AI’s Future: General AI & Robots

Ian: Sometimes, after investing millions into cutting-edge servers and AI systems, I realize the actual function delivered is minimal—something that could’ve been accomplished with a far less costly setup. It often feels like the investment wasn’t worth it. Technology is advancing at breakneck speed, but can industries realistically keep up? In truth, there’s a considerable gap between the pace of technological innovation and a company’s ability to adopt and implement it effectively. Even when the tech is available, very few companies can immediately leverage it. Adoption takes time—both to understand the technology and to align it with operational needs. So what does the endgame look like? For many leaders in the field, such as NVIDIA’s Jensen Huang, the vision is “digital twins”: a parallel world where virtual environments simulate and interact with the physical world. This concept requires tremendous computing power—something platforms like NVIDIA Omniverse aim to provide.

But NVIDIA isn’t just building for today’s simulation needs. If you follow recent developments, their real target is the robotics industry. Elon Musk, too, is shifting focus from autonomous vehicles to robotics. According to the chairman of TSMC, many of the world’s wealthiest innovators are now laser-focused on robotics. That’s the direction future development is headed: AI-powered robots.

This vision represents what many consider the ultimate application of AI—robots powered by general AI, also known as artificial general intelligence (AGI) or “strong AI.” Unlike today’s systems that are trained for specific tasks or industries, general AI would be capable of understanding and performing across many domains. That’s the dream. But right now, we’re still operating in the world of narrow, industry-specific AI.

The Gap Between Hype and Reality — Can AI Really Drive Profit for Businesses?

Ian: What businesses truly want is AI that isn’t just capable in one area, but able to understand and adapt to many. The direction is clear: general AI is the ideal. But in reality, most companies don’t have the scale or budget to chase that vision. They neither need nor can afford the kind of massive infrastructure general AI requires.

So what’s the practical path forward? More companies are starting to favor smaller, purpose-built AI models. Many tasks will still rely on human decision-making, and investments in AI often come down to a trade-off between real substance and outward image. Right now, the return on investment (ROI) for AI tech often isn’t high—but it looks impressive. If your company appears more technologically advanced than your competitors, customers are more likely to trust you. In this phase of AI adoption, perception can matter more than performance.

Still, company leadership wants to know: what’s the ROI? Let’s say your system detects 10 defects in 10,000 products. If those defects slipped through, what would the cost be to compensate your customer? In many cases, a well-trained team of people could handle the issue without the need for high-end tech. So how much is too much to invest—and how little is too little? Striking the right balance is a constant tension in strategic decision-making.

Executives are asking: “Should I invest over a million in a server? Will it really pay off?” This is the reality companies face—balancing ambition, budget, and business value.

From that perspective, smaller LLMs are often the smarter choice. If one AI model can do five jobs, great—but you may need a massive, costly infrastructure to support it. Alternatively, you could use five smaller models, each specializing in a specific task. This distributed approach requires far less hardware and can be more cost-effective. Instead of spending millions to create a generalist model, splitting responsibilities across specialized models is both technically practical and financially wise.
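The “five specialists instead of one generalist” architecture reduces to a simple router. In the sketch below, the specialists are stand-in functions with invented names; in practice each would be a small, task-specific model behind its own endpoint.

```python
# Sketch: route each request to a small task-specific model.
from typing import Callable, Dict

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "defect_report": lambda q: f"[defect model] {q}",      # stand-ins for
    "inventory":     lambda q: f"[inventory model] {q}",   # small fine-tuned
    "maintenance":   lambda q: f"[maintenance model] {q}", # models
}

def route(task: str, query: str) -> str:
    """Dispatch to the specialist for this task; no giant generalist needed."""
    model = SPECIALISTS.get(task)
    if model is None:
        raise ValueError(f"no specialist for task: {task}")
    return model(query)

print(route("maintenance", "vibration rising on press 7"))
```

Each specialist can run on modest hardware, and adding a new capability means adding one small model rather than retraining a large one.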

Case Study: Manufacturing + AI in SEA


LLM for Domain Knowledge

Ian: Expanding manufacturing operations overseas is inevitably intertwined with global geopolitics—something that’s become especially evident with recent tariff disputes. In these situations, the ability to accumulate knowledge and effectively train employees becomes critical. It’s not just about setting up shop abroad—it’s about ensuring your organization retains and transfers expertise efficiently, particularly in culturally and linguistically diverse regions like Southeast Asia, where minimizing communication barriers is key. This is where large language models (LLMs) become extremely valuable, especially for knowledge management. When knowledge is systematically captured and organized, it can be accessed from any location. For instance, an issue at one factory may have already been solved at another. An LLM can help you retrieve past cases instantly and even summarize solutions—allowing new sites to respond faster and avoid repeating past mistakes.

LLM for Training

Ian: Beyond knowledge retrieval, LLMs can also support employee training. In addition to using LLMs, many companies now employ AI 1.0 tools like computer vision to assess whether workers are ready for the production line. These systems evaluate whether a trainee’s physical movements align with standard operating procedures—an area previously judged manually and often subjectively. With AI, evaluations are objective and consistent, ensuring that physical tasks meet expectations.
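A minimal sketch of that kind of motion check, assuming joint keypoints already come from an upstream pose estimator; the SOP angle and tolerance are invented for illustration.

```python
# Sketch: compare a trainee's joint angle against an SOP tolerance.
import numpy as np

SOP_ELBOW_ANGLE = 90.0   # hypothetical target angle (degrees) for this step
TOLERANCE = 15.0         # hypothetical acceptable deviation

def elbow_angle(shoulder, elbow, wrist):
    """Angle at the elbow from three 2D keypoints."""
    a = np.asarray(shoulder) - elbow
    b = np.asarray(wrist) - elbow
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

angle = elbow_angle((0, 0), (1, 0), (1, 1))   # example keypoints -> 90 degrees
print("pass" if abs(angle - SOP_ELBOW_ANGLE) <= TOLERANCE else "retrain")
```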

AI as an Industrial Expert

Ian: While AI 1.0 handles movement-based training, LLMs address knowledge-based skills. We’re now developing what we call a “virtual expert” system. This system leverages LLMs to provide expert-level answers regardless of language differences. It’s like having a senior technician available at all times—only it’s digital. This reduces the need for in-person supervision, bridges language gaps, and makes technical support scalable. The same system can serve R&D teams, production lines, and overseas sites alike by being customized with domain-specific knowledge.

Because LLMs support multilingual inputs, workers can ask questions in their native language and receive accurate, localized responses. It’s as if each factory has a supervisor ready to assist, 24/7. Much like a traditional master-apprentice relationship, the virtual expert learns from curated data. But its reliability depends on the quality and volume of that data—poorly curated or fragmented information will lead to subpar responses. That’s why historical knowledge must be well-processed and fed into the model to develop into a truly capable “AI technician.”
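A minimal sketch of the virtual-expert prompt: the model is told to stay within the curated knowledge and reply in the worker’s own language. The knowledge snippet and question are invented examples.

```python
# Sketch: a grounded, language-matching prompt for the "virtual expert".
KNOWLEDGE = "Conveyor belt P/N 4417: replace when surface cracks exceed 3 mm."

def virtual_expert_prompt(question: str) -> str:
    return (
        "You are a senior factory technician. Answer ONLY from the knowledge "
        "below, and reply in the same language as the question.\n"
        f"Knowledge:\n{KNOWLEDGE}\n"
        f"Question: {question}"
    )

# A worker asks in Vietnamese: "When must a cracked belt be replaced?"
print(virtual_expert_prompt("Dây curoa bị nứt thì khi nào phải thay?"))
```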

Data Silos

Ian: As companies scale across Southeast Asia and beyond, data generation increases dramatically. However, if each site keeps its data isolated, you end up with data silos—pockets of information that are inaccessible to others. This fragmentation limits the organization’s ability to learn and improve. The solution is a unified “data lake,” where all sites contribute and can retrieve information, enabling company-wide learning and faster problem resolution.

Global Manufacturing Quality Issues

Ian: The second major challenge in overseas expansion is maintaining consistent product quality. When production shifts from Taiwan to Southeast Asia, the output must remain identical in quality—any variation by location is unacceptable. Here again, LLMs can help by analyzing data from different sites to detect subtle differences in processes. Small variations that go unnoticed by human supervisors can be identified through large-scale data analysis. By tracing the real influencing factors, LLMs help ensure global consistency.
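Detecting such subtle cross-site drift is, at its core, a distribution comparison. The sketch below uses a two-sample Kolmogorov-Smirnov test on synthetic reflow-temperature data; a real pipeline would run checks like this across many process metrics and let the LLM summarize and explain the findings.

```python
# Sketch: compare one process metric's distribution between two sites.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
taiwan = rng.normal(245.0, 1.5, 400)    # reflow peak temp (deg C), home site
vietnam = rng.normal(247.2, 1.5, 400)   # subtle +2 deg C shift at the new site

stat, p = ks_2samp(taiwan, vietnam)
if p < 0.01:
    print(f"drift detected (KS={stat:.2f}, p={p:.1e}): compare oven profiles")
```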

In short, LLMs serve two powerful roles in overseas manufacturing expansion:

1. Breaking down data silos to enable knowledge sharing across sites.
2. Ensuring product quality consistency globally by identifying and correcting process variations.

These capabilities make LLMs an essential part of the modern, globally distributed manufacturing strategy.

Read Success Case: PowerArena Supports Smart Factory Deployment in Southeast Asia.
