Insights Interview | Chantal Schweizer
Synaptica | Wed, 09 Oct 2024


For this interview Lauren Clark Hill, our Customer Success Manager, talked with Chantal Schweizer, Practice Director, Strategic Data Services at Pivotree. Chantal assists various clients in enhancing their product information to achieve effective UX and data management. With 18 years of experience in the product data management field, Chantal has expertise in crafting cross-platform information structures, ontologies, and testing navigation. She is also a past recipient and judge of the TBCL Practitioner of the Year award. Throughout their discussion, Lauren and Chantal delved into the significance of product information management (PIM), the influence of training, and the process of introducing data management tools to others.

LCH: Tell us a little about you and your career evolution.

CS: Like many, I got into taxonomy completely by accident. I studied art history, but jobs in the arts were scarce. I started my journey at Grainger as part of the resourcing team and later transitioned to the product data team. While working on their Product Information Management (PIM) team, gathering product data from suppliers and normalizing that data, I discovered that I liked organizing the chaos. I spent approximately eight years collaborating on their taxonomy, managing attributes, gathering data, and implementing governance processes.

Then I moved to Schneider Electric with responsibility for PIM administration in one of their divisions. This gave me insight into data from the manufacturers’ side of the house, complementing my knowledge of the distributor data I had worked with at Grainger. After a few years, I moved into consulting. I worked at Ideosity for about a year on PIM implementations, but I was still focused on the taxonomy. With implementations, I noticed that the focus was on the technology rather than the data, so I was continuously trying to make sure that taxonomy still came into play. You can have a great PIM, but if you don’t have a great taxonomy, it’s not going to work.

Following this I joined Earley Information Science. I started as a junior taxonomist and worked my way up to senior taxonomist lead in the eight years I was there. I did sales for a little bit and then became director of the taxonomy team. Then I joined Pivotree as Director of Strategic Data Services. In this role, I work with product data from both the manufacturers’ and the distributors’ points of view, a perspective that really helps garner trust with our clients. It lets you see how different companies set up and govern their data, and helps you understand how people interact with it.

LCH: I find this background interesting. My first master’s is in art history, specifically history of decorative arts. A friend of mine who also works in taxonomy has an MLIS and an MA in art history. I have a belief that one of the reasons taxonomists do so well is due to this strong liberal arts research background. It feels like something we don’t touch on but is running underneath what we do.

CS: An arts background provides insight into how people view the world. We are interpreting a methodology, bringing art and science together to make a taxonomy, beautiful but functional, that works for multiple people. You want to make sure that a taxonomy is balanced, that there is flow, and that intuitiveness comes into play. How people intuitively shop for their products and how they browse through a taxonomy are part of this. It’s all part of making an effective taxonomy.

LCH: I often joke that sometimes when I’m working on a taxonomy, it’s very structured and other times the method is to let the spirit flow through the keyboard and make it work. I find it interesting how many people with similar backgrounds have accidentally fallen into taxonomy.

CS: I’ve met a lot of people in this field with music backgrounds and artistic backgrounds, like English majors. There are also a lot of people with MLIS and library science backgrounds. I was a librarian throughout high school and college, so I love swapping stories with my fellow librarians. We taxonomists have a lot in common; there are commonalities and traits we share, like a love of books and board games.

LCH: Tell us a bit more about Pivotree, the company and the work you do.

CS: I lead and guide a team of wonderful taxonomists who understand great product taxonomy and its best practices. Pivotree is big in both the technical space and the data space, offering our clients a road to a frictionless commerce customer experience. Our systems enable manufacturers, retailers, and distributors to help their customers by streamlining supply chain, data, and eCommerce systems. Our team’s role is to make sure the data [and schema] is correct within these systems.

My team specifically works in this product data space. We make sure the data is successfully set up in a PIM or MDM system and ensure it translates through to the different eCommerce systems ready for customers and retail platforms. We make sure customers have all the data they need to quickly find their products and make a confident purchasing decision.

LCH: Can you explain what PIM systems are? How are they similar and different from taxonomies?

CS: PIM stands for product information management. A PIM system is used as a central repository of the attributes you’re going to use to describe your products, and the attribution taxonomy is the backbone of a PIM system. If you have a light bulb, your specifications are color temperature, voltage, and wattage. If it’s a ladder, you have attributes such as maximum expandable height, material, and weight capacity. These are the technical aspects of the PIM that help you manage that data and ensure data consistency and data quality.
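
The light bulb and ladder examples can be pictured as small, category-specific schemas. The sketch below is a toy illustration in Python; the category and attribute names are hypothetical, not taken from any real PIM:

```python
# Toy sketch: category-specific attribute schemas, as described above.
# All category and attribute names are illustrative, not from a real PIM.
REQUIRED_ATTRIBUTES = {
    "light_bulb": ["color_temperature", "voltage", "wattage"],
    "ladder": ["max_reach_height", "material", "weight_capacity"],
}

def missing_attributes(category: str, product: dict) -> list[str]:
    """Return the required attributes this product record has not filled in."""
    required = REQUIRED_ATTRIBUTES.get(category, [])
    return [a for a in required if not product.get(a)]

bulb = {"color_temperature": "2700K", "voltage": "120V"}
print(missing_attributes("light_bulb", bulb))  # → ['wattage']
```

A real PIM would add datatypes, units of measure, and per-attribute validation rules, but the category-to-attributes mapping is the core idea.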

Then there’s the taxonomy side of it to consider. Taxonomy and the data can be treated as an afterthought in some PIMs, but they are a vital part of any PIM system. If you have a really great motor but you’re running old, gunky motor oil through it, it’s not going to work as efficiently as it should. It works the same way for data in a PIM.

Robust governance is necessary to ensure the data is managed efficiently. You need people who appreciate and manage the taxonomy and attributes. Having strong governance in place helps with this. Your data has to be clean, complete, concise and clear. If you don’t have these in place, then the PIM is not going to be as accessible as it could be. Make sure the health of your data is considered when you’re introducing a PIM system.

LCH: How was the transition from individual taxonomist to being team leader in the taxonomy sphere?

CS: I miss diving into a system and building a taxonomy. There’s something therapeutic about organizing the chaos, and I don’t get the opportunity to do it as much as I used to. What I do love about my current role is being able to share knowledge and network. I post videos on LinkedIn related to best practices, and I have a LinkedIn series called the Data Cafe where we talk to various people, clients, partners, and professionals about product data.

My passion is to teach people about taxonomy. I have people reach out to me on LinkedIn asking for advice. Having the opportunity to talk taxonomy and finding other ways to help people advance their career, both internally with my team and externally with LinkedIn queries, means a lot.

LCH: What makes a good taxonomist? What do you look for when adding people to your team?

CS: There can be slight differences between external consultants and internal organizations, but a few things remain the same. You want people who enjoy organizing, are detail oriented, and understand classification. They need to be able to work towards completion and be well versed in different types of taxonomies; to be able to assess the condition of a taxonomy, identify what’s broken, and see the solutions. Additionally, they need to be comfortable working in a spreadsheet while knowing the other tools that are available.

In addition to these skills, as a consultant you need strong client-facing skills. Taxonomists need to feel confident talking with clients, presenting taxonomies, and managing feedback. You may be required to work with a taxonomy that has been used for several years. It may not be fit for purpose, and we need to socialize best practices with the client to gain adoption of the new taxonomy. Our role is to create a taxonomy that makes the client’s life easier. We all know change can be a challenge; people like to stick with what they are used to. This is where relationship skills are necessary. You need to be mindful, and able to direct clients along a happy path to a new taxonomy.

Flexibility is part of this. You need to be able to identify potential problems and accept that if the client doesn’t want to bend, you have to present the risks and work with them to find an agreed-upon path forward, even if that means bending best practice to let them walk before they run. There are exceptions to every rule in the taxonomy best practice space. Be prepared to take clients down a particular road and see where it leads, and to fix roadblocks if necessary. That adaptability is very beneficial for a consultant, as is being able to explain why we design the way we do and why it will help the client in the long run.

LCH: What advice do you give to others when they are building a taxonomy project?

CS: This is a question I am asked a lot, particularly by those who want to get into the taxonomy space. First, make sure you know the best practices. There are lots of resources available, including great books like The Accidental Taxonomist and Taxonomies.

There are also great events, including face-to-face ones like KMWorld and virtual ones like Taxonomy Boot Camp London. There is also a great Discord channel called Taxonomy Talk, an excellent resource for bouncing ideas off other taxonomists. There are opportunities to talk about conferences, learn about job opportunities, and explore new and emerging technologies.

There is also a group of people I follow on LinkedIn who are evangelists and active networkers, like Jason Hein, Scott Taylor, and Susan Walsh. There are a lot of really great folks out there.

For examples of good taxonomy, another helpful resource is Baymard. It’s a good space for taxonomy best practices, where you can see taxonomies good and bad, different taxonomy levels, facets, and attributes. These are useful avenues for learning about taxonomy, and about product taxonomy specifically.

LCH: Governance is not really as complicated and scary as a lot of people seem to think it is.

CS: It really isn’t. You want to cultivate a governance culture, but start small and work your way up. There are several important factors: a process map, onboarding, modifying attributes, and auditing and testing of the taxonomy, attributes, and associated data. Consider all the roles that are needed and reflect on who is responsible. Who needs to be consulted and informed? Who owns the different aspects of your data, and who participates in those aspects?

What are the steps for each process? Where is the data coming from, and where is it going? Make sure you know who is responsible for each of those steps. Too many companies don’t have governance in place, and this makes data chaotic. Once you have these processes in place, you need to make sure that data stays consistent over time. Style guidelines help you organize your formatting rules: capitalization, pluralization, special characters, and preferred terms.
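
Style guidelines of this kind can be made machine-checkable. Below is a minimal Python sketch of a normalization pass that applies preferred terms and a capitalization rule; the term mappings are purely illustrative:

```python
# Toy sketch of machine-checkable style rules: preferred terms and
# capitalization, as described above. The mappings are illustrative only.
PREFERRED_TERMS = {"flashlights": "flashlight", "torch": "flashlight"}

def normalize_label(label: str) -> str:
    """Trim, look up the preferred term, then apply title-case formatting."""
    key = label.strip().lower()
    key = PREFERRED_TERMS.get(key, key)  # map synonyms/plurals to the preferred term
    return key.title()

print(normalize_label("  torch "))     # → Flashlight
print(normalize_label("Flashlights"))  # → Flashlight
```

A real style guide would also cover special characters and per-field pluralization rules, but encoding even this much lets an audit script flag inconsistent labels automatically.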

You can have a governance council, a governance charter, or reporting meetings. There are a number of different things you can introduce, but start with the bare minimum and work your way up to the more robust governance programs. From the beginning, just make sure you know who’s in charge of what and what your processes look like.

LCH: I’m on a mission to evangelize on behalf of governance. It isn’t scary and style guidelines are important. It’s an issue I am speaking about at the next KMWorld event.

CS: George Firican is a great leader on this subject. You can read some interesting posts on his LinkedIn Channel on this subject.

LCH: What makes good enterprise software? What are you looking for?

CS: When I’m recommending software to clients, what to look for depends on the size and complexity of the client’s data. They may need the flexibility to move and update data quickly, season to season, or to manage different trends and keywords that are continuously changing in their world. That calls for a lot of flexibility and speed of process.

For an industrial supply client, perhaps one who sells thousands of actuators, the product set is going to be static, with a robust attribute set and a lot more data describing each product because it’s being used in technical applications. The item onboarding process is going to be a lot slower. Their needs are foundational, and they will have to handle big business rules. You also need to consider the different ontological factors: product data, customer data, personalization, and opportunities. Clients want to offer personal recommendations as well as access to customer order history or suggestions based on locality. That ontological setup will vary between companies. We consider these different aspects when sourcing software for a customer.

LCH: You were recently awarded TBCL Taxonomy Practitioner of the Year – what was the project that led to that win?

CS: The award win related to a series of videos I made called The TaxoMOMinist. I created these videos in a previous role. I wanted to explain taxonomy best practice in simple terms with the help of one of my children, a teenager, so he could understand my work. He asked some great questions, and we would talk through examples of what was working and not working in a specific taxonomy. Unfortunately, they are not available anymore, but I plan to revisit them in the future.

I was nominated for the award by a former colleague. A lot of people watched the videos; they were being used as training tools. It was something simple but had a major impact. I am proud of winning, as there aren’t many awards in the taxonomy sector. I have since judged subsequent awards, and the win led me to further opportunities to speak at events.

LCH: What do you see as the positives and the challenges for the industry?

CS: On the positive side, I would like to see expanded taxonomy testing. I’m hoping to find ways for an automated process to test continuously over time. Look at trends. Make sure you have dashboards in place that speak to the ROI of having strong data. This gives you the ability to visually show how data fill and accuracy rates go up and how they correlate with your conversion rates. What are your taxonomy success rates? Are you able to measure all of the different metrics over time to show that you have strong taxonomy and strong data? User adoption is going to continue to grow your sales and contribute to continuous revenue. This type of data success criteria is something I would love to see.
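
One such dashboard metric, attribute fill rate, is straightforward to compute. The Python sketch below is a toy example with made-up records:

```python
# Toy sketch: computing an attribute fill rate for a data-health dashboard,
# one of the metrics described above. Records and attributes are illustrative.
def fill_rate(records: list[dict], attributes: list[str]) -> float:
    """Fraction of (record, attribute) cells that are populated."""
    cells = [bool(r.get(a)) for r in records for a in attributes]
    return sum(cells) / len(cells) if cells else 0.0

products = [
    {"material": "steel", "weight_capacity": "300 lb"},
    {"material": "aluminum", "weight_capacity": None},
]
print(fill_rate(products, ["material", "weight_capacity"]))  # → 0.75
```

Tracked over time, a number like this is what lets a team correlate data completeness with downstream outcomes such as conversion rates.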

There is a big challenge in shifting people from taxonomies to ontologies. Ontologies help you connect taxonomies together, and they help with personalization and predicting trends. I want to shift people from a taxonomy to an ontology mindset because of the benefits. We need to move away from spreadsheets to software like the products Synaptica offers. These tools are functional and efficient, and they provide a visual perspective; it’s hard to be visual in a spreadsheet when it comes to taxonomy. Improving the way you present your taxonomy, through knowledge graphs and different metrics and measurables, is effective and important.

Everybody wants a magic tool to create AI functionalities for them, but they don’t consider the importance of the data behind the scenes. A strong data set of clean, organized data is necessary before any of these applications work. People don’t completely understand what’s available and how it can be used.

LCH: You talked about taxonomies and ontologies. Why is there reluctance? What is the barrier for making people embrace ontologies?

CS: It’s perceived as an additional layer of complication, but overall it makes life so much easier. Yes, there are multiple taxonomies, but if you can enable different taxonomies to link and talk to each other, there are so many important data analytics you can access. You can access data and learn about your customers and different sales patterns. If you can find different data sets to help inform your choices, they can lead to big changes in how you operate. Understanding your customers, their personas, and their buying patterns, plus understanding your product data, supply chain, and distribution, makes a big difference. This adds to the success of any business and takes it beyond return on investment.

Synaptica Insights is our popular series of use cases sharing stories, news, and learning from our customers, partners, influencers, and colleagues. You can review the full list of Insight interviews online.

In Conversation | Joe Hilger and Sarah Downs
Synaptica | Wed, 21 Aug 2024


Sarah Downs, our Director of Client Solutions, talked with Joe Hilger, COO and co-founder of Enterprise Knowledge. Joe has over thirty years’ experience designing and implementing cutting-edge, enterprise-scale knowledge and information management solutions. Joe consults with organizations across the world and is a frequent speaker and instructor on topics including enterprise search, enterprise content management, knowledge graphs, machine learning, and explainable AI. For this conversation Joe shared his thoughts on the semantic layer, what’s next for AI, and the power of intellectual curiosity.

SD: Tell us about your career evolution. What brought you to your role at Enterprise Knowledge?

JH: I started in consulting after college back in 1990. I worked for a variety of organizations like Coopers & Lybrand, which is now PwC. It was there I stumbled into enterprise content management in the early 2000s, which led me to enterprise search. During this period, I met my future business partner, Zach Wahl. For a while, we worked and competed with each other. One day, we both thought we should start something. From that conversation Enterprise Knowledge (EK) was founded in June 2013. It has been a fun run ever since.

SD: What was that like, taking that leap and leaving a very traditional corporate career to found your own company?

JH: It’s a neat story. I was running a branch office of a consultancy. We were a small office but consistently profitable. We were doing quite well, and I had learned a ton from the company founders. I reached a point, though, where I needed to take the next leap in my career. As I was beginning to explore this, I thought: who is someone I have worked with who I could confide in and talk through what my next step should be? Zach Wahl knows a lot of people; I’ll call him. We started talking and both said: let’s give this a go.

At the start, we thought: “We’re so connected, this will be easy!” But we found out a lot about who our friends were, who stayed connected. For Zach, Dave Clarke was a big one. And he’s always remembered that.

It took us six months to secure our first client. Then came our second and third. Within 10 months of that first client, we were able to hire our first employee. Today, we are nearly 80 people, and every year we keep growing.

SD: What consistent growth for the EK team. This, of course, requires consistent hiring. What do you look for in your team members? What do you evaluate when you’re interviewing new team members?

JH: This is something that has always been important to us: How do we make sure we get the right people? Of course, we want people to know our industry, but there are some key traits that are really important to us. It’s these traits that we look for to ensure we have a high-performing team:

  1. Intellectual curiosity. I am proud when I watch our company compete with other firms. We recently spoke at a conference where we had three speaking engagements; the audience came to listen to our team, and the response was “wow, these people are so smart.” My response is: yes, because they all want to learn every day. That’s the power of intellectual curiosity, and it’s the way you find people who make a difference.
  2. Kindness. We are in the service industry, and we also want to foster a collaborative environment, so kindness is essential. We are proud that we have really embodied kindness as a company, almost to a fault at times. But at the end of the day, we are a service provider – our fundamental role is to help others. Individuals that are intellectually curious and naturally kind just tend to make the best consultants and are a perfect fit for us and our culture.
  3. Finally, the last trait we look for is a little hunger. What I mean by that is someone who’s eager to prove themselves.

I’ve mentioned these in a very specific order intentionally – starting from the most important. And these characteristics need to be in balance. If a team member is too hungry, they won’t be kind, or they might take shortcuts.

During the selection process we do our best to evaluate these traits, above all else. How much does this candidate want to learn? Do they demonstrate kindness, do they want to help? And finally, do they want more? Do they want to grow?

SD: I used to be a consultant myself, and what you have described really matches my experience as well. You’ve described this triad of traits, and there will be some natural tensions across these characteristics: the need to deliver and strive while also being kind; needing to be deeply curious while also delivering a specific scope within a time frame. But when you strike the right balance, it makes a great consultant delivering great client service.

I know that client satisfaction is a real priority for EK. How do you measure client satisfaction and what systems do you have for keeping satisfaction high?

JH: For all our client work we hold regular project check-ins. These meetings are super important to us, both to leadership and the project teams, and we make them a priority.

The focus of the check-in is simple but powerful: come to us with your problems, and we’ll solve them together. Often people don’t want to share bad news, so we’ve tried to create an environment at EK that fosters these conversations. We ask at each meeting: what are you struggling with? This has created an expectation where people are excited to say “…the meeting’s coming up – I’ve got to tell them what isn’t working,” or “I really want to improve this element of the project, and I’m excited to generate ideas.” As we have discussed, we already have a culture of people who want to serve, and these check-ins give us the time and space to make our project teams successful and our clients happy.

SD: What an amazing way of institutionalizing a growth mindset and continuous improvement. It seems like you’ve been able to cultivate a “no blame culture” where people feel they can be open about what they want to do better or where they are facing challenges. And the whole team is focusing on: “how do we do better for our clients?” – keeping client satisfaction at the center. How does this play out in your customer relationships?

JH: A message we constantly hit home with our teams is: your job is more than just delivering. Your job is to truly understand the challenges our clients face and find solutions for these problems. We recognize that our clients have chosen to invest in EK, and our role is to invest right back in them. This is a part of the corporate culture, our mindset, how we want people to act. We call it the EK Way. It’s framed on our wall. This is one of the most important things we have created as a company, and it’s about how we treat each other and how we treat our clients. It’s one of the reasons our clients fight for us.

One of our favorite stories is about a customer who moved companies. One of her requirements at her new organization was: “you have to allow me to hire the EK team.” When they brought her on, we were part of her strategic plan.

Our projects typically last three to six months, but we’ve had some clients we’ve worked with for six years. These long-term engagements happen because our clients keep finding new projects on which they want to collaborate with us.

SD: The client’s success becomes your success – you remain passionate about solving client problems, which allows you to help them address more, greater challenges. A nice, virtuous cycle.

I can also see through this how you arrived at your key hiring characteristics – you can see how that combination of hunger and intellectual curiosity leads to identification of mutual opportunities, but the kindness and empathy makes sure it’s all in service of the client first and foremost.

Can you tell us about the substance of your projects – what client problems are you solving?

JH: Knowledge management strategy is one of the hottest areas for us. We start with the idea that our services make sure that the right people get the right information at the right time.

We’ve had multiple enterprise organizations come to us and say: “we need a strategy for how we manage all of our knowledge, data, and information.” The consulting side of our business works with their teams and their executives to develop a plan, a roadmap, and a set of projects to build and improve the way they capture, manage, and share information.

A lot of projects are looking at classic content challenges: information management, taxonomy management, and componentized content management. There has recently been a resurgence of search work. But regardless of the type of project, at the end of the day it comes back to getting people the right information.

We do a lot of work with Chief Data Officers (CDOs). Many of these clients are struggling to manage their data warehouses. Our semantic layer services have been a great way to solve many of the common problems CDOs face today. The best way to describe the semantic layer is to think of it as a map between the way the business thinks about information and the way the information is stored.
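
As a rough illustration of that map idea, the Python sketch below pairs business vocabulary with storage locations; all of the names are hypothetical and not drawn from EK’s actual tooling:

```python
# Toy sketch of a semantic layer as a map between business vocabulary
# and where the data is physically stored. All names are hypothetical.
SEMANTIC_LAYER = {
    "customer": {"table": "crm.accounts", "key": "account_id"},
    "order":    {"table": "sales.transactions", "key": "txn_id"},
}

def locate(business_term: str) -> str:
    """Translate a business concept into its storage location."""
    entry = SEMANTIC_LAYER.get(business_term.lower())
    return entry["table"] if entry else "unknown"

print(locate("Customer"))  # → crm.accounts
```

Production semantic layers express these mappings with ontologies and knowledge graphs rather than a flat dictionary, but the translation step, from business concept to physical storage, is the same.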

We have been talking about it for a while, the analysts are talking about it and now companies are embracing it.

Knowledge graphs and the end-to-end tools that you are creating at Squirro and Synaptica help make this semantic layer a reality in a way that has never previously been possible.

SD: Yes, it certainly has been a few years where you can’t get away from these buzzwords: LLMs, knowledge graphs, RAG. But we are at an exciting moment, I think, when we are moving past the buzzwords and into operational reality. How do you see your clients using knowledge graphs today, and do you think that’s going to change over time?

JH: Everyone is jumping on the AI bandwagon but rapidly realizes “garbage in, garbage out.” The technology has changed, but a core fundamental remains: how do we know that we have the right information?

The initial jump has been: “let’s create chat bots and let’s put the knowledge graph or let’s put the LLM in front of our search, so it gives back the answer that we’re all accustomed to.” That was the initial leap, and we see this in some of our search projects. But I think some of our most interesting work is when we consider the LLM, not as the endpoint, but as a partner in what we do.

Let me give you an example of what I mean. We are working with a very large financial institution that was managing risk. They needed a list of terms, basically a taxonomy, which would suggest risks. Since there was nothing to start with, they were going to use another professional services vendor, a Big 4 consultancy, to analyze thousands of contracts and start to build a list of terms. The project was based on a team of people doing this work.

One of our team felt this didn’t make sense, and he quietly, and with curiosity, said, “you know, I think I could use an LLM to do this.” We said to our client, “do you mind if we try this?” Our team member went off and, after a week, he said: “I’ve got it working.” We had a demo, and it became a jaw-dropping moment. They would have needed as many as 10 consultants to work on this project, and the need for outside help nearly went away completely. The client was happy. The LLM did the research and analysis in a way that couldn’t be done before, and then a person could augment it from there.

I love that Squirro + Synaptica are in many ways at the forefront of this and of course they’ve done it through search and chat bots because that’s where everyone is working right now. But we know that there’s more to do here. Inserting a partner into part of this process is where we’re going to see value as companies get smarter.

SD: That’s interesting. We talked with Dorian Selz for another Insights Interview a few months ago. In that conversation he said something similar: chatbots are where people start with LLMs; they are not where people are going to end.

The chatbot functionality feels so human to people – it’s an incredibly appealing front-end interface but not always reliable without a knowledge graph back-end, and it’s limited in what business problems it can solve. What you’ve described so well is that the LLM functionality needs to partner with a human in the right way. LLMs do some things well – people do other things well. How do you combine them for a much better solution?

With a lot of the AI innovation people worry about their roles but here is a perfect example of how the technology can make us all better at our jobs; it’s not a lack of work, but higher, different work that the technology can enable. That’s exciting.

JH: We have been through this so many times during industrial and technology revolutions. This isn’t new. The Internet was going to eliminate jobs for many people, but what really happened is it changed the work people did.

Another example: we were interviewing people to join the EK team, and it was clear we weren’t asking the right questions. Our team developed a process where candidates submit a resume for a specific position. The LLM compares the resume against the job requirements and generates the candidate’s strengths for the position, potential weaknesses, and recommended interview questions written in the style defined in EK interview training.

SD: You have given some great examples of how LLMs and humans can partner to great effect.

I know you are often advising clients on enterprise solutions and taxonomy management tools, even in this time of rapid evolution. What do you think makes a good enterprise taxonomy software in the current environment?

JH: Early on there were some differentiators. Auto-tagging was critical, and then everyone wanted integration with tools like SharePoint/365. Those were kind of the essentials. Now it feels like most vendors have that, and enterprises are looking for solutions to develop the semantic layer, which I mentioned earlier – that connective map that ties structured and unstructured information together.

That’s something very appealing about the Synaptica acquisition by Squirro. Synaptica had all the enterprise-level capabilities for taxonomy and ontology management. Now, as part of Squirro, you’re not just getting one piece of the puzzle; you’re getting an extensible, larger piece of what’s needed.

At a recent large data conference, the question that came up multiple times was: “what are the products and the system architecture you need to develop a semantic layer?” We listed all the tools you need: data catalogue, search engine, taxonomy and ontology management system.

Most say: “That seems like a lot of technology.” But you really do need all these pieces. Clients face the need to purchase different tools or stitch together existing tools to create the full continuum. That’s not an answer they love because it creates a lot of complexity for the organization to manage. Companies today want to work with a solution that solves most of the problem. They don’t want a one-off tool.

The fact that we are seeing all these requisites in one software tool with Synaptica is really meaningful. At EK, we are excited that we are seeing consolidation of the stack in a way that lets us go to clients and offer them, for the first time, a solution through which they can access most of what they need all in one place. It’s so much better.


SD: I come from a taxonomist background, so something I’m really proud of in our tools is how we offer no-code solutions that put great power in the hands of the taxonomist. Instead of owning one piece of the workflow and then being blocked by a need for data science or engineering resources, our taxonomist users can power 90% of a full auto-categorization flow – from taxonomy design to tagging, back to taxonomy design, in a full loop.

But you’re pointing out rightly that there are solutions that can really empower and ease the burden on the IT professionals and information architects, the knowledge managers who otherwise have to integrate all these individual systems via API. The end-to-end Squirro + Synaptica offering reduces this sizeable effort before the system can even begin to work.

JH: We have a development staff because we do these types of projects on behalf of clients. Our developers are always looking for new challenges. They will come to us and say, “we have to do this again?!” They want products that take care of the simple things so that they can add value with more complex solutions. I love that mindset because, again, our clients want to do more with us, so our team doesn’t worry about having work. What they worry about is doing cool and interesting things. Building the same connector 20 times, doing the same work over and over again, is not cool.

SD: I have always believed you don’t want your data scientist running queries for a taxonomist that could be a button in a product. You want your data scientist diving into all the rich data that the product can provide through cultivation of a knowledge graph. That’s where they should spend their time. That’s where they’re enthralled and where the business value comes from. You’ve described the quandary on the development side as well – what are those developers *not* doing because they have to build basic plumbing that should be productized and ready to go out of the box. Hopefully, your development team has a lot more exciting things to do in the year ahead.

JH: It’s so neat to build something new or do something incredible – something that’s meaningful for our clients. A lot of consulting firms forget this. They look for easy, repeatable work that earns as much profit as possible. Our assumption is: if we do something that has an impact, our clients will always find more work for us. They want to work with our team at EK. It becomes a partnership.

SD: Truly partnership rather than transactional relationship. That’s wonderful. In addition to being a consultant, I know you host a podcast, you’ve written books, you speak at events – tell us about these opportunities to be a thought leader.

JH: Our podcast is called Knowledge Cast, and it is the leading podcast on knowledge management. I host a section of it focused on technology vendors. The goal of my section is to let our vendors talk in a positive way about their products. As far as I’m concerned, it can be a commercial, and the reason I say that is we want to offer our clients and listeners a chance to hear about all the great tools that are out there. We want them to hear about them from the vendors themselves.

I don’t know if there’s any thought leadership there other than getting smart people to talk about their products and share their passion. I see us as just a communication vehicle.

But speaking at conferences and writing the book, though? That’s really been fun.

I reflect on this every once in a while. When we started, I led a lot of our technical design work. But that’s changed. Most of the new ideas we have from a technical standpoint don’t come from me anymore. For nearly 10 years, we have had an internal knowledge sharing session that we run every two weeks with the team. It starts with company updates, but most of the time is spent on someone sharing something cool they have done with a client. These meetings contribute to our culture of people producing new ideas.

I’m becoming the person that talks about the exciting ideas that the EK team as a whole has generated. I’m not always the one who originates the idea anymore. In this way, I think of myself not as a thought leader but more as a communicator of thoughts.

We all teach and learn from each other. It’s a message I share with our clients.  We talk about the power of EK as a whole. Our clients get more than the people on the project. We have communities of practice where all our people share ideas and knowledge (within the bounds of confidentiality and data security, of course).

SD: Thinking about the client partnerships that lie ahead for you, what do you see as the big challenges for the sector and industry that you’re going to be helping your clients solve?

JH: This brings me back to the semantic layer.

I know everyone’s focused on AI. Everyone seems to be running fast, trying to figure out what they are doing. We see a lot of companies building their own RAG solutions. They think they can do this internally and run into a typical pattern of problems. Depending on the size of the company, they launch something in 8 to 10 weeks. Some larger enterprise clients take longer.

The response is amazing when it is initially launched. Pretty quickly, though, the users start seeing problems like hallucinations and start to lose faith in the tool. Another four weeks go by, and they realize they should have worked with an expert. This is when companies like Squirro and EK get a phone call.

Product vendors are building robust RAG tools. They are spending time, resources, and tuning their solutions to build something strong, reliable, and trustworthy. Many of those clients that started on their own realize that they should have used experts who can ensure that the solution is fast, reliable, and trustworthy. In a way, this is very similar to what we saw with data lakes. Organizations said “we’re going to put all our data in a data lake that’ll solve everything. It’s all in one place so that we can find what we need.” They launched and quickly found that they still can’t find anything.

We need to start thinking about AI in a smarter way and not repeat the errors of the data lake era. In the next couple of years, more companies will come to understand what it takes to do information management the right way – not limited by a structure people are stuck with, but organized in a way that aligns with how people think. As we move to this future, it’s going to be quite a ride.


Synaptica Insights is our popular series of use cases sharing stories, news, and learning from our customers, partners, influencers, and colleagues. You can review the full list of Insight interviews online.

The post In Conversation | Joe Hilger and Sarah Downs appeared first on Synaptica.

]]>
7892
Content-Aware Knowledge Graphs https://synaptica.com/content-aware/ Thu, 15 Aug 2024 10:12:29 +0000 https://synaptica.com/?p=7872 The post Content-Aware Knowledge Graphs appeared first on Synaptica.

]]>

This article was originally published in the MESA M + E Journal Spring 2024.

Abstract: Many organizations invest significant resources in tagging their content, with varying results for accuracy and comprehensiveness. Machine tagging offers both promise and peril: scaling tagging with algorithms is valuable only if machine performance can exceed human tagging. Extending document tagging into a content-aware knowledge graph through algorithmic categorization (using extensible analytics services, including LLMs), powered by enterprise taxonomies and curated by a human-in-the-loop, offers a step-change for the functionality of enterprise content: similarity indexing, recommendations, data audits, and data insights accessible to non-technical users.

Many organizations invest significant resources in tagging their enterprise content, with varying results for metadata accuracy and comprehensiveness. This content metadata then powers content discovery through browse, search, or recommendations.

But are today’s workflows doing enough to surface relevant content? With a different approach, can you put your content metadata to greater use?

Traditionally, tagging approaches rely either on human tagging or machine tagging with little partnership between the two approaches. Marrying the power of information science with data science and placing a human-in-the-loop to manage the process can power better outcomes.

Further, this same process can cultivate a content-aware knowledge graph, which can power more refined and relevant content discovery of enterprise content.

The blog below explores these themes further. Where is there opportunity for your organization to take content metadata to the next level?


The limits of human tagging

What challenges exist for human tagging today? Most enterprise workflows face challenges with accuracy, completeness, and scalability.

These challenges arise because many common workflows rely on human readers who apply concepts from a controlled vocabulary (e.g., an enterprise taxonomy) to identify the topics described within a piece of content. Often, these controlled vocabularies will describe branded products and services, industries, and subjects. The human reader may rely on tagging governance or business rules (e.g., apply one tag from an industry taxonomy, no more than three tags from a product taxonomy, etc.).
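Governance rules of this kind can also be expressed as machine-checkable constraints. A minimal Python sketch, mirroring the illustrative rules above (the taxonomy names and limits are examples, not drawn from any particular deployment):

```python
# Check a document's applied tags against illustrative governance rules:
# exactly one tag from the industry taxonomy, at most three from the
# product taxonomy. Taxonomy names and limits are examples only.

def validate_tags(tags_by_taxonomy):
    """tags_by_taxonomy maps a taxonomy name to the list of applied tags."""
    errors = []
    if len(tags_by_taxonomy.get("industry", [])) != 1:
        errors.append("exactly one industry tag required")
    if len(tags_by_taxonomy.get("product", [])) > 3:
        errors.append("at most three product tags allowed")
    return errors

print(validate_tags({"industry": ["Manufacturing"],
                     "product": ["Drills", "Saws", "Sanders", "Routers"]}))
# ['at most three product tags allowed']
```

Rules like these can run at tagging time to give human readers immediate feedback, rather than relying on after-the-fact audits.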

Usually, these human readers are either content experts (e.g., the authors of the content) or a team member responsible for content publishing (i.e., an owner of a process but not an expert in the subject of the document).

In the first “expert reader” case, the resulting content metadata is likely to be accurate but incomplete. An expert knows the content well and is unlikely to apply an incorrect tag, but they are often not incentivized to tag the content completely. Their core job is not tagging content and they have better things to do with their time than apply every relevant tag to a document. They are likely to apply the most important tags and then move on.

In the second “process owner” case, the reader is more likely to apply a wider range of tags, as this is a core part of their workflow and responsibilities. The content metadata, however, may still be incomplete, especially for longer documents, which require “skimming” by the reader. Further, the tags applied are more likely to be incorrect. The process owner is not a content expert and may be prone to errors and misunderstanding.

Ultimately, many organizations invest a lot of time and human resources in tagging enterprise content, but the quality of the resulting content metadata is questionable, and the cost tends to be high.


The risks and cost of machine tagging

Machine tagging of content offers both promise and peril; scaling tagging with algorithms is valuable for most organizations only if machine performance can match or exceed human-generated content metadata.

Relative to human readers, tagging algorithms are more likely to generate many tags, but these may be more “noise” than “signal”; some type of intervention or training is generally required to tune machine tagging algorithms for specific enterprise content.

Even large language models (LLMs), which seem to perform so exceptionally, are prone to highly convincing hallucinations. Further, at this point in their evolution, LLMs trained on large publicly available corpora can perform variably on specific proprietary content.

Whichever model or algorithm is selected, machine tagging of enterprise content generally requires skilled data science or engineering resources to train and retrain models or to query data to understand model performance. This adds either expense or a process bottleneck, as content teams need to rely on technical teams to support their use or improvement of machine-generated content metadata.

Marrying information science and data science

There is a blended path, however, that can harness the power of machine scale to apply curated controlled vocabularies for creation of high-quality content metadata.

There are multiple ways to achieve this, but any approach that achieves transparency and explainability will have the benefit of enabling content experts (e.g., content strategists, taxonomists, content authors, marketing managers) to manage the process with limited engineering and data science support. This expands the user community available to drive a process of iterative improvement that can ultimately exceed the performance of human tagging or machine tagging alone.

The core elements of a human-in-the-loop approach include:

  • Connecting enterprise taxonomies to text analytics services that apply the controlled vocabularies at scale as document annotations.
  • Enriching enterprise taxonomy properties to support machine tagging (i.e., expanding alternative labels to increase concept recognition, introducing context rules such as ‘must match’ or ‘must not match’ to filter inaccurate concept matches and thereby decrease noise in the content metadata).
  • Displaying outputs of machine tagging in a transparent and explainable interface.
  • Iteratively improving machine tagging through review of explainable document annotations and enrichment of the taxonomy in a continuous feedback loop.
  • Capturing quality metrics to track improvement: false negatives, false positives, recall, precision, F1 score.
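The quality metrics in the final step can be computed by comparing machine tags for a document against a human-reviewed “gold” tag set. A minimal sketch (the tag values are illustrative):

```python
# Compare machine-generated tags against a human-reviewed gold set
# and derive precision, recall, and F1 for the tagging process.

def tagging_metrics(machine_tags, gold_tags):
    machine, gold = set(machine_tags), set(gold_tags)
    tp = len(machine & gold)   # correct machine tags
    fp = len(machine - gold)   # false positives (noise)
    fn = len(gold - machine)   # false negatives (missed tags)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}

# Illustrative tag sets: two correct tags, one false positive, one miss
print(tagging_metrics(["ai", "search", "cloud"], ["ai", "search", "graphs"]))
```

Tracking these scores across iterations of taxonomy enrichment is what makes the feedback loop measurable rather than anecdotal.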

Relative to human-only tagging, this approach can reduce cost while improving quality of the content metadata.

Relative to machine-only techniques, this approach has many benefits:

  • Rapid deployment with no extensive model training required
  • Faster implementation and rapid iterative improvement
  • Extended user community (no coding or scripting skills required)
  • Transparency and confidence in data

Cultivating a content-aware knowledge graph

The value of content metadata is increased further when content tags are stored in a graph database. In this way, content tagging cultivates a content-aware knowledge graph, which can support further content insight.

With a graph database storing content metadata, any of the following relationships can be stored, queried, analyzed, and visualized:

  • Which documents or content sets are most similar to one another?
  • How are specific concepts trending over time, language, or geography?
  • What concepts tend to co-occur in the same document?
  • Which content types are low performing in their representation of concepts in controlled vocabularies?
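As an illustration of the first and third questions, here is a minimal in-memory sketch using shared-tag (Jaccard) similarity and concept co-occurrence. The documents and tags are invented for the example; a production system would run equivalent queries directly against the graph database:

```python
from collections import Counter
from itertools import combinations

# doc -> set of taxonomy concepts tagged to it (invented data)
tags = {
    "doc1": {"ai", "search", "graphs"},
    "doc2": {"ai", "graphs", "ontology"},
    "doc3": {"compliance", "audit"},
}

def jaccard(a, b):
    """Similarity of two tag sets: shared tags over all tags."""
    return len(a & b) / len(a | b)

# Which two documents are most similar to one another?
most_similar = max(combinations(tags, 2),
                   key=lambda p: jaccard(tags[p[0]], tags[p[1]]))

# Which concepts tend to co-occur in the same document?
cooccurrence = Counter(frozenset(pair)
                       for doc_tags in tags.values()
                       for pair in combinations(sorted(doc_tags), 2))

print(most_similar)                               # ('doc1', 'doc2')
print(cooccurrence[frozenset({"ai", "graphs"})])  # 2
```

The same logic extends naturally to trend analysis by adding time, language, or geography properties to each document node.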

In this way, the content-aware knowledge graph can power data insight, better content recommendation systems, content and compliance audits, and many other use cases that are inadequately served through many standard tagging workflows today.

The content-aware knowledge graph can power data insight, better content recommendation systems, content and compliance audits, and many other use cases.

LLMs and your knowledge graph

The approach described above can also support LLM-driven analysis of enterprise content. This is not ‘either / or’ but ‘both / and’. Pairing an LLM with the approach outlined above can drive the following benefits:

  • Boosted LLM inputs to generate higher quality outputs
  • Retrieval augmented generation (RAG) to leverage custom data
  • Improved interpretability of LLM outputs
  • Embeddings for similarity search
  • Extraction of novel concepts (i.e., identifying concepts not already represented in enterprise-controlled vocabularies)
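Embedding-based similarity search ultimately reduces to comparing vectors, typically by cosine similarity. A minimal sketch (the vectors are toy values; in practice they would come from an embedding model):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vectors standing in for real embedding-model outputs
query = [0.2, 0.7, 0.1]
docs = {"doc1": [0.1, 0.8, 0.05], "doc2": [0.9, 0.1, 0.3]}

# Rank documents by similarity to the query vector
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked)  # ['doc1', 'doc2']
```

Storing these vectors alongside taxonomy-based tags in the graph lets exact concept matches and fuzzy semantic matches complement one another.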

Conclusion

Most enterprises today allocate significant resources to tagging proprietary content, but this is not always time and money well spent. A smarter investment is to balance the human and machine approaches and cultivate a content-aware knowledge graph that can power similarity indexing, recommendations, data audits, and data insights accessible to, and driven by, non-technical users.

Many organizations would benefit from migrating to a hybrid approach to content metadata creation – harnessing algorithms to scale human-curated controlled vocabularies. The partnership of information science and data science can achieve precision and recall scores that exceed those of siloed approaches.

The post Content-Aware Knowledge Graphs appeared first on Synaptica.

]]>
7872
Squirro Acquires Synaptica: A Strategic Fusion of Generative AI and Knowledge Graph Technologies https://synaptica.com/squirro/ Wed, 03 Jul 2024 13:30:10 +0000 https://synaptica.com/?p=7743 The post Squirro Acquires Synaptica: A Strategic Fusion of Generative AI and Knowledge Graph Technologies appeared first on Synaptica.

]]>

Squirro, a leading Swiss-headquartered global SaaS platform specializing in enterprise-ready generative AI, search, and business insights, proudly announces its acquisition of Synaptica, a renowned US-based SaaS provider of enterprise taxonomy management and knowledge graph systems.

This strategic acquisition integrates Synaptica’s robust semantic graph technology with Squirro’s cutting-edge generative AI capabilities, creating a powerful platform for knowledge discovery, conversational search, and business process automation.

Enhancing GenAI with Knowledge Graphs

Generative AI technologies, such as Retrieval Augmented Generation (RAG), leverage the strengths of traditional information retrieval combined with Large Language Models (LLMs), enabling natural language understanding and generation. RAG has swiftly become a sought-after solution, empowering enterprise users to ask natural language questions, retrieve relevant content and data, and transform it into contextually informative responses.

“RAG alone does not ensure the completeness and accuracy of information for mission-critical enterprise applications. Squirro quickly identified that this gap can be bridged by combining generative AI with enterprise knowledge graphs and semantic technology,” stated Dorian Selz, CEO of Squirro.

Synergizing Capabilities for Enhanced Enterprise Solutions

Synaptica’s SaaS platform is utilized by a diverse global customer base, including major corporations and government agencies, to manage enterprise knowledge, encompassing taxonomies, ontologies, and knowledge graphs.

“Our customers span various industries, but they all share a commitment to high-quality information retrieval. They use semantic technology to build knowledge organization systems, standardize and enrich metadata, and optimize precision and recall,” said Dave Clarke, co-founder of Synaptica. Clarke will join Squirro’s leadership team as Executive Vice President, Semantic Graph Technology.


Key Benefits of the Integration

The integration of Synaptica’s semantic graph technology into Squirro’s generative AI platform will enable organizations to:

  1. Enhance RAG Information Accuracy and Completeness: Ground GenAI processes in transparent, human-editable knowledge models.
  2. Automate Business Processes: Utilize process-oriented taxonomies and ontologies to guide RAG decision-making, ensuring appropriate refinement questions and recommended next steps.
  3. Improve Content Classification and Semantic Search: Use single-source-of-truth enterprise taxonomies to provide consistent quality metadata across multiple content repositories.
  4. Discover Latent Knowledge and Business Insights: Continuously enrich an enterprise knowledge graph that infers new knowledge from the semantic associations and similarities of entities.

“Synaptica completes Squirro’s capabilities, and Squirro completes Synaptica’s,” said David Hannibal, Chief Product Officer and Head of Corporate Development at Squirro. “Even before finalizing this acquisition, we began collaborating with customers eager to combine our respective product offerings.”

This acquisition marks a significant milestone for both companies, positioning them at the forefront of the rapidly evolving landscape of AI-driven knowledge management and business process automation.

For further information, please contact:

David Hannibal

Chief Product Officer and Head of Corporate Development, Squirro

david.hannibal@squirro.com

About Squirro

Squirro is a global SaaS provider headquartered in Zürich, Switzerland, offering enterprise-ready generative AI, search, and business insights solutions. Our mission is to empower organizations with the tools they need to harness the full potential of their data through advanced AI technologies.

About Synaptica

Synaptica is a US-based SaaS provider specializing in enterprise taxonomy management and knowledge graph systems. Our solutions enable organizations to manage and leverage their knowledge assets, enhancing information retrieval and business process automation.

The post Squirro Acquires Synaptica: A Strategic Fusion of Generative AI and Knowledge Graph Technologies appeared first on Synaptica.

]]>
7743
Insights Interview | Joyce van Aalten https://synaptica.com/joyce-van-aalten/ Wed, 29 May 2024 09:07:18 +0000 https://www.synaptica.com/?p=7671 The post Insights Interview | Joyce van Aalten appeared first on Synaptica.

]]>

Joyce van Aalten is an independent taxonomy consultant and trainer. She has 15+ years’ experience with taxonomy, thesaurus, ontology, and knowledge graph projects for a broad range of customers. For this Insights Interview with the Synaptica team, Joyce talks about her experiences as an independent consultant and explains her concept of minimal viable taxonomies.

Tell us about you? What led you to taxonomies?

JVA: I live in the Netherlands, married with two kids, 14 and 11 years old. My head, my thought process is around how you structure things. I love language. I love semantics and the meaning of words.

My children and my childhood have definitely influenced the way I work today. As a child I created my own library. All my books had a mark at the top, a number, and a small removable card inside the cover. Today, I enjoy teaching my children how a language is structured, what are synonyms, what are antonyms. If you combine these parts, you get a taxonomy.


I started my career studying library science in the Netherlands. After I graduated, I worked for a couple of years as an information specialist and desk researcher. Weaving information together was part of my job, and this evolved into facilitating and finding information. I thought how interesting it would be to give advice on findability. My first experiences of taxonomy were about findability. These are the examples that I use now when I talk about taxonomy projects.

Later, I joined a consultancy agency in the Netherlands. That was my first real experience as a consultant with a special interest in information management, but also working with taxonomies. Subsequently, the consultancy focus changed towards the IT sector, but I wanted to stay with the taxonomy part. I decided to work as an independent consultant and created Invenier. This was about 14 years ago.

You have been a solopreneur for almost 15 years. What’s it like running an independent consultancy?

JVA: It’s fun and I get to work “my way.” Independence gives me the freedom to work on the projects I enjoy and develop the methods to deliver them. There is variety to the work I undertake; nothing is routine, and I enjoy embracing change.

I often work on several different assignments at the same time, juggling between projects. They can get mixed up and overlap, but this can be a good thing. Added value comes from the experience of diverse work: different organizations, a variety of tools and domains, mixing and blending the experience of multiple projects. You might be using a client’s content management system that turns out to be great experience for another project.

I’ll say my work never stops. I often get my best ideas when I’m doing practical things, folding the laundry or taking a walk.

Tell us a bit about the type of projects you work on.

JVA: My projects often vary in range and size.  A small project might involve an organization already using a thesaurus or taxonomy. They need confirmation and an expert review. Are we doing the right things; are we on the right track? I might be asked to review the viability of taxonomies.

This can grow into workshops or training. Students come in, and I teach them how to create a taxonomy. This can be a challenge with limited time.

Larger scale projects tend to take longer. For example, working with content strategists to create a taxonomy. They may require migration from one content management system to another, which requires preparing the content before it can be absorbed. Enterprises often have ambitious aims, for example: omnichannel strategies. The taxonomy has to be part of this.

Creating a taxonomy that aligns to these approaches needs to involve input from members of the content team. I team up with content owners and managers because they have insight into what content goes into the system, what needs to be excluded, and what you focus on. They might be involved with tagging. It’s important to keep them involved throughout, and make sure they are supported.

It can take time as there are all the different processes to consider; cleaning the content; preparing for migration and implementing the taxonomy content management. This is where taxonomy tooling can come in and make a difference especially with multiple content management systems.

Have the kinds of projects you work on changed over time?

JVA: It has and it hasn’t. The basics remain the same; they haven’t changed that much. We are back to my childhood library – we still need to label, to categorize. In these terms, projects remain the same.

There are changes and progression in the sense of new technologies: the impact of AI, taxonomies in relation to LLMs. What hasn’t shifted is that we are still working with legacy systems and content divided across several repositories. In addition, we are working with people who are not always quick to adapt; there can be reluctance. Other factors that haven’t changed relate to tight budgets and limited processes or structures.

That’s why my answer is yes, there is drastic technological change but some things remain the same: the people and the process.

What do you enjoy most about your project work? Is there an element that you really relish?

JVA: Definitely the creation of the taxonomy. During this part of the process, I can dig deep into the content and switch on my taxonomist nerd mode. It’s a bit like entering a small room, closing the curtains and concentrating hard.

The other part I enjoy is the collaboration with others. One of my goals is always to enthuse others about taxonomies. I want people to enjoy the work and get more out of it. It’s important to be an advocate, an ambassador for the practice.

Are there any particular challenges that you often solve?

JVA: There are common traits like solving findability and retrieving content. Working with taxonomies is a bit like working under the hood of a car. Taxonomies are like the engine in a car, hidden and not visible. When you start driving you don’t always think about the complexity underneath. You just want to drive.


Where do you start with a taxonomy project?

JVA: Sometimes clients have a sense they need a taxonomy, but they are not sure where to start. Can it work for them? The first step with most projects is to begin with a content audit. You look at files, what’s being shared, folders, the structure of the content. Look at the different systems involved, the document management system or other content repositories. But you have to set limits with this process, otherwise you really can get stuck down a rabbit hole. There needs to be scope and set dates. I recently discussed this issue and the idea of using a minimal viable taxonomy (MVT) at Taxonomy Boot Camp.

Developing a perfect, precise, but large enterprise taxonomy can be a challenge. Starting with a minimal viable taxonomy is good for an organization that isn’t sure what or how the taxonomy can be leveraged. They have to figure out how the taxonomy can work for them. Starting with a smaller scale version gives them a feeling for how it works, how to use it and how they can grow the taxonomy further.

With an MVT you scope the taxonomy to the content of one content repository in which the taxonomy will be implemented. This MVT will be stored in a spreadsheet and in the content repository it is used for. The MVT should have hierarchical relationships and synonyms. These can be expanded with associative and other semantic relationships when the MVT grows towards a full taxonomy.
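As a sketch of what such an MVT spreadsheet might look like, the rows below model one hypothetical content repository: each concept row carries a parent (the hierarchical relationship) and a semicolon-separated list of synonyms. The column layout and concept names are illustrative, not a prescribed format.

```python
import csv
import io

# A hypothetical MVT kept as a spreadsheet: one row per concept, with a
# 'broader' column for the hierarchy and a 'synonyms' column for
# alternative labels. Extra columns (e.g. associative relationships)
# can be added later as the MVT grows toward a full taxonomy.
MVT_CSV = """\
concept,broader,synonyms
Documents,,
Policies,Documents,Guidelines;Procedures
Contracts,Documents,Agreements
Invoices,Documents,Bills
"""

def load_mvt(text):
    """Read the MVT and index each concept by its preferred label."""
    rows = csv.DictReader(io.StringIO(text))
    return {
        row["concept"]: {
            "broader": row["broader"] or None,
            "synonyms": [s for s in row["synonyms"].split(";") if s],
        }
        for row in rows
    }

mvt = load_mvt(MVT_CSV)
print(mvt["Policies"]["broader"])    # Documents
print(mvt["Contracts"]["synonyms"])  # ['Agreements']
```

Because the structure is just columns, the same file can live both in a spreadsheet and alongside the content repository it serves, and can be re-exported into a dedicated taxonomy tool once the MVT proves its value.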

What would you look for in a taxonomy tool?

JVA: Ease of use. I am used to working with all kinds of taxonomy tools, but in working with people who are not taxonomists at heart, I have seen that usability is important. Taxonomies often start in a spreadsheet or perhaps a mind map. Tools can offer functionality for supporting workflows, creating the structures, storing audit trails, and allowing for migration. These are things that can’t be done in a spreadsheet: you can’t collaborate with multiple users, or track what’s going on and how the taxonomy is being used.

Do you notice any difference in the ways taxonomies are used in Europe versus the US or elsewhere?

JVA: Most of my project work is based in Europe, but in recent years I have been working at an international level. Overall, the work is the same regardless of location – the approach and the technology remain the same. It’s only the language that is different. Working with different or multiple languages might require a different development approach, however.

What makes a good taxonomist?

JVA: For me, it’s not the skills or competencies as much as you may think. Your educational background can be any subject, any field, but it’s the mindset that matters most: the desire for structure. You can study library science, but it isn’t a prerequisite to becoming a good taxonomist. Taxonomists like organization and structure, and have a natural instinct towards being tidy.

What general advice would you give to someone developing a taxonomy?

JVA: Organizational needs can be different from one organization to another. Try to find the biggest pain point but be prepared to start small. It needs to be feasible – something that can be easily developed or implemented. Usually, when working on a new project, or a new content system, my first thoughts are to find a problem, develop the solution, and show the value.

When working on a new project, or a new content system, my first thoughts are to find a problem, develop the solution, and show the value.

Why do organizations need taxonomies?

JVA: Taxonomies and ontologies can be leveraged in so many ways. You can enrich search, improve the use of content, and support reasoning. You can do so many things. Organizations generally see the immediate value of taxonomies when they improve retrieval. They help you find your content. The taxonomy becomes the glue between all the content and data. If you have a tin can without a label, you don’t know what’s inside. You will need to open the can to see the content. Taxonomy and metadata are important today because machines don’t understand what’s inside, and we still need to tell them what the content is.

Synaptica Insights is our popular series of use cases sharing stories, news, and learning from our customers, partners, influencers, and colleagues. You can review the full list of Insight interviews online.

The post Insights Interview | Joyce van Aalten appeared first on Synaptica.

In Conversation | Dr. Dorian Selz and Sarah Downs https://synaptica.com/dorian-selz/ Tue, 30 Apr 2024 09:05:21 +0000 https://www.synaptica.com/?p=6560 The post In Conversation | Dr. Dorian Selz and Sarah Downs appeared first on Synaptica.


For this interview we invited Sarah Downs, our Director of Client Solutions, to talk with Dr. Dorian Selz, the Co-Founder and CEO of Squirro. Prior to co-founding Squirro, Dorian founded the Swiss search platform local.ch, making it a market leader within four years. Prior to this he was a Partner and COO at Namics AG – the largest eBusiness consultancy in Switzerland and Germany.

Dorian holds a PhD from the University of St. Gallen and a Master’s in Economics from the University of Geneva. During the conversation Sarah and Dorian discussed his career evolution, Squirro’s embrace of LLM (Large Language Model) AI technology, and the emerging relationship with Synaptica.

SD: Your career path demonstrates you are very much a serial entrepreneur. Tell us about your career evolution to this point.

DS: I’m from a small farming village in the middle of nowhere, a remote part of Switzerland. I had the opportunity to study in Geneva, and it was there I got involved with a student organization, the European Students’ Forum. This opened up the world for me, including the chance to study for a PhD. I then studied in Aberdeen, Scotland, and completed my PhD at the business school of the University of St. Gallen.

It was at this time, the mid-90s, that we founded our first company supporting web sites: Namics AG, an eBusiness consultancy with a strong presence in Germany. We provided classic professional services work – designing the first web pages of companies like UBS and Siemens – working through specific customer engagements or projects. It was one of those projects which led to my next company, local.ch. This grew to become Switzerland’s largest homegrown website and search engine. Swisscom had a majority stake in the company, and it was later sold.

My next startup was an online note-taking tool similar to Evernote. We tried to raise venture capital funding, which was challenging in Europe compared to the US. This type of product was popular and we were very competitive. But we learnt a hard lesson: no one – not Evernote, not us, nor the other freemium proponents – ever achieved the level of paying users required to sustain the business model. You can’t survive as a restaurant if your patrons eat the food but don’t pay for it.

We took the opportunity to reinvent ourselves and that’s where Squirro comes in. We took what we had effectively done in the past and twisted it, applying it to Enterprise Search.

Making sure the appropriate information comes to you at the moment you need it, in the context you need it. That's the vision that makes me get up in the morning.

SD: I’m interested in the way you’ve described your next business coming out of something that’s happening in your current business. What was the moment of realization you had another potential business in Squirro?

DS: It’s about information. Making sure the appropriate information comes to you at the moment you need it, in the context you need it. That’s the vision that makes me get up in the morning. If you take the information coming to you when you need it, this implies a few things. You don’t want everything to come at you at once; that would be too much. The aim is for the appropriate things to come to you when needed. What you want is the appropriate information at the appropriate moment in the appropriate quantity.

What you want is the appropriate information at the appropriate moment in the appropriate quantity.

SD: Your team at Squirro has grown to over 30 people. What do you look for when you’re hiring new team members? 

DS: I want to collaborate with people I can hang out with for a long time. I’ve been with my two co-founders for the past 25 years. It’s been the greatest privilege of my professional life. Some of the people we have built this company with have been together for an exceedingly long time.

There is a famous book, Good to Great by Jim Collins. What he says is that you need to get the right people on the bus, get the wrong people off the bus, and then decide which direction to take.

SD: Interesting. You start with the people and only then find the problem.

DS: The moment you’re in business pursuing an ambitious idea, it’s more like an expedition. On an expedition you’re together for months, like a crew on the way to Mars. NASA selects the teammates that go together on a space mission. It’s important that they want to go, take part in training, are physically fit, and technically brilliant. But that is not the defining criterion. What they look for is team composition. In space, if something goes wrong, can these guys figure it out together as a team? Can they rely on each other? This doesn’t mean that you’re the best friends ever. But you need to work it out together.

SD: In 2023 Squirro embraced LLM AI technology, and rapidly developed solutions for Retrieval Augmented Generation (RAG). Can you share with us your overall vision for LLMs?

DS: Large Language Models (LLMs) have been around for a while, since before ChatGPT. This type of technology was created through open innovation introduced a few years ago, and eventually led to the creation of companies like OpenAI. These LLMs are thoroughly trained neural networks, which sounds quite complicated and fancy, but at the end of the day their function is quite simple – they probabilistically predict the next word.

Once you understand this, you understand where the strengths and weaknesses are. Their strengths are immediately recognizable: after extensive training, they can, out-of-the-box, recognize structure in text. But out-of-the-box, LLMs also have pretty big disadvantages. First, there’s a probability they compute the wrong output – these errors are “hallucinations.”

The second disadvantage is a bit counterintuitive. Large Language Models are actually pretty bad at ingesting substantial amounts of data at high speed. Today it is very costly and takes time to train a Large Language Model. This is not something that works in a day-to-day, fast-moving business context. The volume of enterprise business demands would be an issue – a telecom provider has literally thousands of customer tickets a day. It’s impossible, almost useless, to try and manage this through LLMs today.

The third issue relates to LLMs’ disregard for enterprise security. As a company, we immediately, and intuitively, understood that LLMs were going to be a game-changer – and that the LLMs’ drawbacks are effectively the strong points of an existing enterprise search engine. But there is a lot of work involved in combining these two approaches and building something you can deploy at the enterprise level.

That’s what we’ve done during the past 12 months. The reality is: AI was a minority sport a year ago. Now, no one can deny its impact and reach with the advent of these innovative technologies. The combination of techniques allows you to create business applications that can have massive economic impact.

SD: What do people get wrong about LLMs and AI right now?

DS: My analogy would be about the advent of social media platforms. When they launched everyone was expecting that Utopia had arrived – we are all going to be one big loving group worldwide because we have all these connections.

Turns out that these methods of connecting and communicating have also developed into platforms for hate speech, negativity, and toxic online behaviour. Are these social networks doing good for the world? In the last 20 years we have experienced civil wars, terrorist attacks, and online abuse. It’s a fair question to ask: are these inventions that went wrong? Either way, the technology’s use and impact turned out to be vastly different from our expectations.

In another analogy: I remember 25 years ago we did the first test with eCommerce and everyone told us: “No one is going to spend that much; no one is going to order stereo equipment online; there is a maximum to what people will be willing to spend over the internet.” The same with fresh food: “No one will buy vegetables online because of the logistics involved. It won’t work.” These predictions have been proven to be wrong.

When it comes to LLMs, I don’t think the chat application is going to be where this technology has the greatest impact. It’s the focus of a lot of time and energy right now, but it’s not where LLMs will end up in 5 years or 20 years.

When it comes to LLMs, I don’t think the chat application is going to be where this technology has the greatest impact. It’s the focus of a lot of time and energy right now, but it’s not where LLMs will end up in 5 years or 20 years.

SD: Where do you think the greater promise of LLMs lies?

DS: Let’s orient ourselves with a simple view of the enterprise value chain – product is created and then sold. In this context, chat interaction on the sales side is just a tiny piece of the overall business function. In all the other parts of that process chain, especially for mid to large sized companies, you have massive amounts of activity. And within this you have many, many underperforming processes. The way companies operate today may be efficient in today’s world, but with the advent of LLMs you can 5x, 10x improve these processes.

I think LLMs are going to support a massive change in the way businesses are run. The real value will be the ability to optimize entire process chains in ways you couldn’t before. The euphoria around LLM-driven chat applications will settle down. I’m curious to see how many large organizations can navigate that maelstrom.

There will be a few people that will manage the transition. You will see the emergence of unexpected winners in that race. We’re also going to see transformative reconfiguration of entire value chains. You’re going to see reconfiguration of the way we buy books or organize our next trip away. The way we organize the production of whatever we produce.

What Dave Clarke and the team are delivering at the moment is a real breakthrough. Hardly anyone has mastered the business integration element – knowledge graphs as a way to describe business processes - but the team at Synaptica do.

SD: As you prepare yourself for this new world, how do you see Synaptica products enhancing Squirro solutions for your customers?

DS: What Dave Clarke and the team are delivering at the moment is a real breakthrough. Everyone thinks of a Knowledge Graph as a Knowledge Graph – it’s a good place to start. Hardly anyone has mastered the business integration element – knowledge graphs as a way to describe business processes – but the team at Synaptica do.

Let’s discuss an example: we spoke with a mobile telecom company. They frequently launch new product names for the same basic mobile plan. One is called Safe Plus, the next Safe Plus + One. Then we have Talk Cheap, followed by Talk Cheap Unlimited. And if you’re talking about technology back-ends, this multiplicity of naming becomes even more complicated: you have Cisco router 5EF-B, Cisco router 5EF-C etc.

This is where structured data is so powerful for organizing varied and complex information, and when you combine structured data with LLMs you get Retrieval Augmented Generation (RAG). Now you have a transformative stack available. You can immediately transform the entire customer support system. You solve the complexity of change – you can reorganize the whole business process in an efficient way.

The real thing that I want to achieve through our partnership with Synaptica is a transformed ability to respond to a service ticket or call. The individual responding to the service request can search through different systems, look up resources and find the solution. Let’s transform that role.

Look at air traffic control. You are dealing with multiple planes arriving at an airport, and they all require landing space. There is a danger that planes fly too close together. The Air Traffic controller can see who is arriving in sequence on a screen, and they can orchestrate the various flights. The systems then collaborate directly with the relevant stakeholders, ground staff, luggage transport. All the people involved in the landing process.

This same approach could work for our customer service person. They orchestrate the various inbound tickets or calls but their role becomes a conductor of business processes. It’s a fundamental change.

SD: I like that customer support example, because we see our customers building a lot of taxonomies around managing customer support, like classifying product and service issues. I think it’s an area where a lot of enterprises have focused on taxonomies. It’s also a place where I think people tried to immediately adopt the chat application with mixed results. There seems to be this obvious fit between LLMs and customer service provision. But you’re really imagining a much more mature, integrated approach: getting people the right information at the right time and place – using the power of LLM and the power of search and the power of taxonomies to get there – letting the machines do what they do well and empowering humans to do what they do better.

DS: Yes, you immediately transform the entire customer support system. You solve the complexity of change – you can reorganize the whole business process in an efficient way.

Synaptica Insights is our popular series of use cases sharing stories, news, and learning from our customers, partners, influencers, and colleagues. You can review the full list of Insight interviews online.

The post In Conversation | Dr. Dorian Selz and Sarah Downs appeared first on Synaptica.

Insights Interview | Lauren Clark Hill https://synaptica.com/insights-interview-lauren-clark-hill/ Fri, 01 Dec 2023 11:13:00 +0000 https://www.synaptica.com/?p=7348 The post Insights Interview | Lauren Clark Hill appeared first on Synaptica.


Lauren Clark Hill recently joined the Synaptica team as Client Solutions Specialist. For this interview we discussed Lauren’s experiences, what it’s like to be a Synaptica client and a hint of her new role. Lauren Clark Hill comes to Synaptica with three years of experience working with taxonomies at Meta and LinkedIn and ten years of metadata management experience at institutions including Yale University Publishing and the Smithsonian.

Tell us about your early experiences and your move into the taxonomy sector?

LCH: I grew up in the town where I live now, Cape Girardeau, Missouri. I’ve moved away and returned several times for education and work, but this is the town in Southeast Missouri where I grew up. My husband is from here as well. We have been together for 19 years, married for 8, and have 2 young children. My brother and his family still live here, so our children get to play together a lot. It’s one of the things that keeps me here rather than being based in a big city again. My mother moved abroad in 2012. She teaches English at international schools. She’s worked in Korea, China, Hungary, and now she’s based in Malaysia.

As a kid I was the one who wanted to do everything. Lots of clubs and activities in high school. I am very much someone who has always been very, very busy. After High School I moved to Northern California for my undergraduate studies. Then returned to Missouri, worked in special education for a year as well as working as a cheerleading coach. Another move to Washington, D.C. for my first master’s program in history of decorative arts. I studied textiles and costume but also gender and racial studies of the American South. While I was based in D.C. I first started working with data and metadata. I worked a lot in museum collections ensuring that the items in the collections were properly recorded and catalogued. Museum systems are not always the best in terms of design and functionality. It was kind of a trial by fire to initially learn on their systems.

In between getting married and starting a family I worked with museums, non-profits, education and community groups. I shifted into consultancy roles working in digital marketing. During all this I improved my knowledge of databases and analysis. There’s a university locally and they regularly posted library roles. One of the requirements was an MLIS or MLS qualification, which led to me studying my MLIS in 2019.

Part of the application process included a taxonomy exercise. This is standard for this type of position. When I looked at the exercise it was like that lightbulb moment – oh this is it. This is what I have been looking for.
Working in technology and taxonomy was very much one of those like coming home moments.

A friend of mine who works for Google as a UX researcher suggested I look at taxonomy-based roles. Then I saw a role with Meta working with ads for the taxonomy team. Everything came together perfectly. Part of the application process included a taxonomy exercise. This is standard for this type of position. When I looked at the exercise it was like that lightbulb moment – oh this is it. This is what I have been looking for. Working with terms, information and categorizing data, metadata. This is how my brain works and thinks. I stayed with Meta for some time and then joined LinkedIn. Personally, working in technology and taxonomy was very much one of those like coming home moments.

What are you most looking forward to about joining the Synaptica team?

LCH: When I worked at Meta, we used Synaptica software. This was my introduction to working with a technology provider and their software. I was a member of the team supporting the development of the mapping tool that Synaptica produced for Meta. This was the first time I had the opportunity to work with developers. I needed to explain what we needed and work on every step of that process. I was impressed with the responsiveness and the willingness of the Synaptica team to listen and work with clients. The ability to bounce ideas back and forth and shape this great tool. Synaptica quickly earned a special place in my heart. They set the standard of how a software company should work. This is what customer service should look like.

When the opportunity presented itself – would I be interested in joining the team? – I was instantly attracted because of my positive experience as a customer. It’s also fascinating to be on the other side, in a position of support and help, helping end users identify the solutions that they need.

I was impressed with the responsiveness and the willingness of the Synaptica team to listen and work with clients. The ability to bounce ideas back and forth and shape this great tool. Synaptica quickly earned a special place in my heart. They set the standard of how a software company should work. This is what customer service should look like.

Can you tell us more about your new role with Synaptica?

LCH: I will be wearing many hats supporting our new product Graphite Knowledge Studio and, to some extent, our main product Graphite. I will be working closely with Dave Clarke and Sarah Downs on client solutions.

One part of my role I’m particularly excited about is working on the Synaptica YouTube channel and activity. I will be producing learning videos explaining features, how-to guides, and walkthroughs related to our services and products.

Overall, I’m enthusiastic about this aspect of building client resources and support tools. I will also be responding to clients’ requests and questions – providing day-to-day assistance and advice. I wasn’t able to join the team at KMWorld this year, but I look forward to taking part in this and other community events and conferences in the future. Sarah and I are also developing a series of Synaptica Webinars.

What do you think makes a good taxonomist? 

LCH: For me it’s a combination of logic and intuition. Understanding industry standards like ANSI-NISO Z39.19 and ISO 25964 when constructing taxonomies can help. Developing a taxonomy is as much an art as it is a science. Over time taxonomists gain experience in when to use standard methods, and when to depart from or augment them.

You know that you can go back and tweak elements. You can include terms even though they don’t quite work logically. I have often jokingly said, “every now and then when you’re working on taxonomies you just have to go with it”. A taxonomist needs that balance.

Why are taxonomies important to enterprises?

LCH: Enterprises want to find the right information when they need it. When you are managing vast amounts of data an effective taxonomy supports this. It helps you find the material that you require, quickly and easily. Your end users are never going to be able to fully use the breadth of your information if there is no taxonomy structure available. Taxonomies help an organization activate their data. This can be amazing, groundbreaking. Data becomes useful when it can be found. Otherwise, massive amounts of information without a taxonomy are simply piles of data.

Enterprises do need to establish who is going to be using this information and how they want to access it. I firmly believe the taxonomy has to resolve these issues. Categorization and tagging can also help, they provide a sense of where things are. Where they are located and can be accessed.

Taxonomies help an organization activate their data. This can be amazing, groundbreaking. Data becomes useful when it can be found. Otherwise, massive amounts of information without a taxonomy are simply piles of data.

What general advice would you give to people developing a taxonomy?

LCH: First and foremost, determine why you are building this taxonomy. With the same concepts, we could build five different taxonomies depending on how individuals are to use it. Understanding and establishing clear goals is part of this process. I always recommend in-depth analysis, perhaps through brainstorming, before people begin to work with the actual taxonomy.

Who’s going to be using this?

What do they need to find?

Use the question words that we work with. Identify these goals first, and they will guide you in how you’re going to build your taxonomy. The next step can include reviewing your guidelines document for this specific taxonomy, or even a subset of the taxonomy. You might want to add extra information and consider the levels of depth you want to explore; rather than going 5 levels deep, it might be fine at 3 or even 2.

What do you look for in a taxonomy tool?

LCH: Reporting is one priority for me. Synaptica products have strong reporting features, which makes it easier to review the taxonomy and clean things up. You can draw a report to check on progress and visualize the information in a different way. You can look for duplicates. Having that kind of structure makes life much easier.

What do you think are the biggest challenges for the sector in the future?

LCH: It’s going to be related to generative AI. Right now, there is a lot of excitement around adoption and working with large language models. The major challenge will be finding the balance between using new technology and still ensuring you have sufficient human interaction with the product. I understand why people are excited to jump on, but AI is still not quite where people think it is. It’s not a replacement tool, it’s a support tool. It can help you. You still need to have some kind of human review, and more importantly, LLMs need taxonomies and ontologies to give them the domain context to provide relevant, focused results.

In this ever-changing technological landscape, what do you think are the opportunities that enterprises should address?

LCH: Addressing how technology will improve employee capabilities and performance versus choosing tech because it’s new and shiny. There are companies investing in amazing, super intelligent people and then they’re not providing them with the resources that they need. I think that it comes down to investing in people and enabling them to keep innovating and pushing the boundaries.

Tell us about developing resources to support taxonomy tools?

LCH: People come to taxonomy from many different backgrounds, so resources and documentation can pose a real challenge when starting out. If the resources rely too much on jargon and an assumption of similar base knowledge, then you run the risk of shutting out minds that can add insight and topical expertise.

Ensuring that resources cover the range from beginner to advanced and include quick-reference documents like FAQs and glossaries will serve to enable and support users without creating barriers. This is also a great reason to ensure that all support resources are accessible – whether that’s alt-text for screen readers or simply ensuring that there is sufficient color contrast in any graphics.

I am very much an advocate of well written documentation for any software, but especially for taxonomy software.

Successful adoption of a software tool needs to provide a reward for its users, whether that’s improved productivity, improved modelling capabilities, improved reporting or depth of analytical insights, improved collaboration and workflow, or simply pulling together multiple tasks into one tool that previously required working in multiple environments.

How do you ensure individuals and organizations adopt and use taxonomy tools?

LCH: I’m a huge believer in communication – when people know what a tool is capable of and, importantly, how to use that tool to accomplish their goals, then they are far more likely to use it. When a tool is presented in an overly complicated way or in such a way that users don’t see an advantage to using it over their current methods, then it will just sit, unused, in their virtual toolbox. Successful adoption of a software tool needs to provide a reward for its users, whether that’s improved productivity, improved modelling capabilities, improved reporting or depth of analytical insights, improved collaboration and workflow, or simply pulling together multiple tasks into one tool that previously required working in multiple environments.

Synaptica Insights is our popular series of use cases sharing stories, news, and learning from our customers, partners, influencers, and colleagues. You can review the full list of Insight interviews online.

The post Insights Interview | Lauren Clark Hill appeared first on Synaptica.

Breakthrough Moments in Enterprise Taxonomy Management https://synaptica.com/breakthrough-moments/ Wed, 29 Nov 2023 16:59:06 +0000 https://www.synaptica.com/?p=7416 The post Breakthrough Moments in Enterprise Taxonomy Management appeared first on Synaptica.


Earlier this month at KMWorld in Washington DC I got to meet many of Synaptica’s clients, technology partners, as well as some new faces from the community. It’s always a pleasure to attend KMWorld because of the opportunity to connect in person, and this year was no exception.

I think it’s fair to say that LLMs dominated the discussions. We as a taxonomy community are facing rapid technological change and grappling with where our discipline fits into a world captivated by ChatGPT and other generative models. In this moment of technological change, I think it’s helpful to reflect on the challenges our professional community has already faced and solved together. I was fortunate to present a keynote talk on this topic, and I will summarise my key themes from that presentation: Breakthrough Moments in Enterprise Taxonomy Management.

In my presentation, and in this blog, I outline the challenges our community has successfully addressed, and continues to address, and highlight a few of the solutions the Synaptica team has been developing. Specifically, I explore the challenges of complexity, scale, explainability, and trust.

The challenge of complexity

Knowledge models for taxonomy have responded to an increasing need for sophistication and expressiveness. The breakthroughs outlined in this section are the effort of the entire community, especially those individuals and companies who have contributed to industry standards and specifications like ANSI-NISO Z39.19, ISO 25964, and W3C RDF, SKOS, and OWL.

Terms to concepts

The journey starts with the migration from term-based taxonomies to concept-based taxonomies enabled by SKOS.

This simple step-change separated the identifier for each concept from its lexical labels. With RDF, these identifiers are represented with HTTP-URIs.

The community gained many advantages from this breakthrough. For example, the method for managing multilingual taxonomies became massively simpler (one concept could now hold multiple language labels), resulting in huge cost savings compared with the previous data model.
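To illustrate the step-change, here is a minimal Python sketch (plain tuples rather than a real RDF library) of one concept identified by a hypothetical HTTP-URI, carrying preferred labels in three languages. Adding a language means adding a label to the same concept, not creating a new taxonomy entry.

```python
# One concept, one hypothetical HTTP-URI identifier, labels in several
# languages. Triples are plain tuples: (subject, predicate, (text, lang)).
SKOS = "http://www.w3.org/2004/02/skos/core#"
concept = "http://example.com/taxonomy/concept/1234"  # hypothetical URI

triples = [
    (concept, SKOS + "prefLabel", ("Railway", "en")),
    (concept, SKOS + "prefLabel", ("Eisenbahn", "de")),
    (concept, SKOS + "prefLabel", ("Chemin de fer", "fr")),
]

def labels(graph, subject, lang):
    """Return the preferred labels of a subject in one language."""
    return [text for s, p, (text, l) in graph
            if s == subject and p == SKOS + "prefLabel" and l == lang]

print(labels(triples, concept, "de"))  # ['Eisenbahn']
```

In the old term-based model, “Railway”, “Eisenbahn”, and “Chemin de fer” would have been three separate entries to keep in sync; here they are three labels on one identified concept.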

Taxonomy to ontology

Next came the migration from taxonomy to ontology.

The widespread adoption of RDF as a semantic data model transformed simple taxonomies into extensible and expressive ontologies. Once the semantics of a knowledge organization system were no longer confined to hierarchical and associative relationships, people could define any named relationship between things, and define classes of things with distinct sets of properties. Not only are these ontologies capable of representing any domain of knowledge, but the structure of the data model is also transparent and intelligible to both humans and machines.

Ontologies in RDF also support machine inferencing, allowing new knowledge to be derived from existing data.

Labels as things

Another step-change came when labels became things that can have their own properties. The data model evolved to support SKOS-XL.

A concrete example helps to explain the breakthrough. The person with the birth name ‘Stephen King’ wrote a novel called The Shining. The same person also wrote a novel called The Running Man, but he did so under the pen name ‘Richard Bachman’. With SKOS-XL, Stephen King and Richard Bachman are bound together as one concept, but each label also has a unique URI, making it a thing rather than a string. This enables each name to carry independent properties, such as ‘Stephen King’ authorOf The Shining and ‘Richard Bachman’ authorOf The Running Man.
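The labels-as-things idea can be sketched in a few lines of Python. All URIs and property names below are hypothetical stand-ins that loosely echo SKOS-XL (e.g. its literal-form property), not a real data model:

```python
# Sketch: each label is itself a resource with a URI and its own properties,
# while both labels still belong to one underlying concept (the person).

labels = {
    "http://example.org/labels/stephen-king": {
        "literalForm": "Stephen King",
        "authorOf": ["The Shining"],
    },
    "http://example.org/labels/richard-bachman": {
        "literalForm": "Richard Bachman",
        "authorOf": ["The Running Man"],
    },
}

person = {
    "uri": "http://example.org/concepts/king",
    "xl_prefLabel": "http://example.org/labels/stephen-king",
    "xl_altLabel": ["http://example.org/labels/richard-bachman"],
}

def works_under_name(label_uri):
    """Return the works attributed to this specific name (label), not the person."""
    return labels[label_uri]["authorOf"]

print(works_under_name(person["xl_altLabel"][0]))  # → ['The Running Man']
```

With plain SKOS, both names would be bare strings on the concept and there would be nowhere to attach the per-name authorship facts.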

Relationships as things

It’s perhaps not surprising what came next. With RDF-Star, relationships became things. Using RDF-Star, the relationship joining two concepts (in graph speak, the edge between two nodes) can carry its own set of properties.

Again, a concrete example will help explain why this was a powerful breakthrough. Standard RDF lets us express the following statement ‘Stephen King’ authorOf The Shining. With RDF-Star we can add a date to the relationship between the person and the book, and state that ‘Stephen King’ published The Shining in 1977.
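The same example can be sketched by treating the statement itself as a key that carries its own metadata, which is the essence of what RDF-Star makes possible (the structures and names here are illustrative, not an RDF-Star implementation):

```python
# Sketch: the triple (edge) itself carries properties.
# Standard RDF can say only: ("Stephen King", "authorOf", "The Shining").
# With statement-level annotation we can also say *when* it was published.

triple = ("Stephen King", "authorOf", "The Shining")

edge_properties = {
    triple: {"publicationYear": 1977},  # metadata about the statement itself
}

def year_published(subject, predicate, obj):
    """Look up a property of the relationship, not of either node."""
    return edge_properties[(subject, predicate, obj)]["publicationYear"]

print(year_published("Stephen King", "authorOf", "The Shining"))  # → 1977
```

The key point is that 1977 is a fact about the authorship relationship, not about the person or the book alone.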

Knowledge graphs

The final step in model complexity covered in my talk is the advent of enterprise knowledge graphs, and, more specifically, what Synaptica calls content-aware knowledge graphs.

What do we mean by content-aware knowledge graphs? When our auto-categorization engine, Graphite Knowledge Studio, tags content with metadata derived from taxonomies, we also capture and store this content metadata inside the knowledge graph, thereby linking concepts to the content they describe. As more content gets tagged, the knowledge graph expands, providing a powerful resource for business insights and analytics as well as powerful business functions like similarity indexing and recommendations.

With great solutions come new challenges . . .

As ontological schemas have evolved to manage increasingly complex knowledge models, the user experience challenge has grown. How can tooling and interfaces make what is inherently complex appear simple and comprehensible?

Synaptica’s Graphite application provides a collaboration space for knowledge engineers and content managers to create and manage enterprise taxonomies and ontologies. Graphite’s intuitive drag-and-drop user interface provides a simplification layer on top of complex semantic data models, enabling non-experts to rapidly design and build standards-compliant knowledge organization systems.

As I will explore below, we have brought this same investment in usability to our newest product, Graphite Knowledge Studio, to embrace the same diverse user community we have long served through Graphite.

The challenge of scale

Some organizations will also face the challenge of scale.

Extreme scale

Many, if not most, enterprise taxonomies have under 10,000 concepts. This is depicted by the barely visible red bar on the left below. Synaptica has a few clients with large taxonomies, on the order of 100,000 concepts, represented by the small bar in the middle. This year Synaptica on-boarded clients with extreme-scale taxonomies, on the order of 10 million concepts, represented by the tall bar on the right.

Responding to this challenge, Synaptica’s engineers and design team set about a massive re-engineering project. It involved refactoring queries to deliver performant search and browse response, but it also required us to rethink the user experience and workflow. What happens if a user is clicking through a hierarchy and lands on a concept with a million related concepts? What happens when a concept can participate in hundreds of different relationship types? Our engineers developed innovative solutions to each of these challenges.

Adaptive navigation

Graphite now features Adaptive Navigation for exploring both hierarchical and associative links. Click by click, Adaptive Navigation dynamically assesses the scale of data beneath each link.

If it detects a scale that cannot be meaningfully rendered as a browse experience, it adapts by switching the user to a search-inside control, which searches within the set of concepts beneath a concept in the hierarchy or connected via an associative relationship.

In the example below a concept is related to 1,058,231 other concepts. Entering a keyword into the search-inside control filters the results down to a manageable 95 related concepts.
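The decision logic might be sketched as follows. The threshold value and the function names are assumptions for illustration only, not Synaptica’s actual implementation:

```python
# Sketch of the adaptive-navigation decision: below a threshold, render a
# browsable list; above it, switch to a search-inside control that filters
# the related-concept set by keyword.

BROWSE_THRESHOLD = 500  # assumed cut-off, not the product's real value

def navigation_mode(related_count):
    """Choose how to render a link based on how much data sits beneath it."""
    return "browse" if related_count <= BROWSE_THRESHOLD else "search-inside"

def search_inside(related_concepts, keyword):
    """Filter a large related-concept set down to keyword matches."""
    kw = keyword.lower()
    return [c for c in related_concepts if kw in c.lower()]

print(navigation_mode(42))         # → browse
print(navigation_mode(1_058_231))  # → search-inside
```

The point of the pattern is that the user never has to know in advance how big the set is; the interface makes that call at click time.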

The user interface also adapts to the scale of relational predicates. SKOS has 12 relationship types, and though most enterprises add some custom relationships, the total is still generally manageable for the UI to display in full with drag-and-drop functionality. One of the extreme-scale taxonomies used by a Graphite client, however, has 662 relational predicates. Instead of attempting to display all 662 when only a few are needed, the UI adapts to display only populated predicates while allowing the user to quickly add more.

Search performance

The last breakthrough for the challenge of scale is search performance. Even though extreme scale taxonomies are thousands of times larger than normal enterprise taxonomies, taxonomists still require performant response times on these taxonomies for the tooling to be usable.

Synaptica engineers responded to this need with optimized search indexes and queries. In benchmark tests against taxonomies with over 7 million concepts, sub-second response times are delivered where the number of concepts returned is under 100, and result sets containing thousands of concepts are returned in under 4 seconds.

The challenge of explainability

With increased reliance on machine learning algorithms, the community faces the newest challenge: that of explainability.

As we extend our products to harness the power of machine learning at scale, Synaptica teams have considered carefully how best to marry the powers of information science with the advantages of data science without falling prey to the “black box.” This thinking is evident in our newest product, Graphite Knowledge Studio, an extension to Graphite that enables users to tag content with enterprise taxonomies at scale.

Synaptica’s design principle for successful autocategorization systems is based on three pillars:

      1. explainable results
      2. transparent rules
      3. rapid iteration

Explainable results

Within Graphite Knowledge Studio, we have achieved explainable results by giving the taxonomist the ability to inspect inline annotations, relate these inline annotations to inferred document-level classifications (displayed with confidence scores), and relate all tagging results to specific taxonomy concepts and metadata.

This transparency and explainability are key to extending the user community beyond engineers, data scientists, and others with coding and scripting skills. This expansion allows enterprises to better leverage the deep content knowledge of non-technical subject matter specialists while freeing enterprise technical resources (engineers, data scientists) for other projects.

Transparent rules

The configuration of enterprise taxonomies to power these explainable algorithms relies on SKOS and its extensibility through custom properties.

SKOS provides a widely adopted ontology schema for creating taxonomies. Some of its properties and relationships can be used to facilitate autocategorization. For example, SKOS prefLabels and altLabels can be used to find concept matches within document texts. The SKOS hierarchical structure can be used to infer generalized aboutness classifications (e.g., a document that specifically mentions ‘apples’, ‘pears’, and ‘oranges’ could be inferred to be about the broader concept ‘fruit’, even if ‘fruit’ is never mentioned in the document).
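The hierarchy-based aboutness inference just described can be sketched as a simple roll-up. The toy taxonomy and the threshold of two matched children are illustrative assumptions, not how Graphite Knowledge Studio actually scores documents:

```python
# Sketch: if several narrower concepts are found in a document, infer the
# shared broader concept as a document-level "aboutness" classification.
from collections import Counter

# Toy broader-concept lookup (child -> parent), invented for illustration.
broader = {
    "apples": "fruit",
    "pears": "fruit",
    "oranges": "fruit",
    "carrots": "vegetables",
}

def infer_aboutness(found_concepts, min_children=2):
    """Return broader concepts supported by at least min_children matches."""
    counts = Counter(broader[c] for c in found_concepts if c in broader)
    return [b for b, n in counts.items() if n >= min_children]

print(infer_aboutness(["apples", "pears", "oranges"]))  # → ['fruit']
```

A single stray mention of ‘apples’ would not be enough to call the document about ‘fruit’; requiring multiple narrower matches is what makes the inference defensible.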

But the SKOS specification was written for human readability and falls short when it comes to some of the machine-readable rules needed by auto-categorizers. For example, SKOS doesn’t support positive or negative contexts (essential for disambiguation), or textual patterns for novel entity extraction, or proximity rules and relevance ranking.

Synaptica’s Graphite Knowledge Studio responds to this challenge by extending the SKOS ontology with additional predicates to support the needs of autocategorizers.

The rules are simple and transparent, which empowers the taxonomist with direct control over how the tagger works.
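As an illustration of what such an extended rule might look like, here is a sketch of positive/negative context disambiguation. The rule shape, field names, and example values are invented for this post and are not Graphite Knowledge Studio’s actual schema:

```python
# Sketch: a concept match only counts when its context supports it.
# Negative context vetoes the match; positive context confirms it.

rule = {
    "concept": "Apple (company)",
    "match": "apple",
    "positive_context": {"iphone", "mac", "cupertino"},
    "negative_context": {"orchard", "pie", "fruit"},
}

def tag(text, rule):
    """Return the concept if the match term appears in a supporting context."""
    words = set(text.lower().split())
    if rule["match"] not in words:
        return None  # term not present at all
    if words & rule["negative_context"]:
        return None  # vetoed: the text is about the other sense
    if words & rule["positive_context"]:
        return rule["concept"]  # confirmed by supporting context
    return None  # ambiguous: no evidence either way

print(tag("the apple iphone launch", rule))  # → Apple (company)
```

Because the whole rule is a handful of readable term sets, a taxonomist can see exactly why a tag did or did not fire, which is the transparency point made above.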

Rapid iteration

The third pillar for successful autocategorization is the ability to support rapid iteration. With explainable results and transparent rules, the taxonomist can quickly jump back from reviewing inline annotations to modifying the tagging rules in the taxonomy management system. As soon as the taxonomy is modified the changes are immediately available to the autocategorization tagger, enabling the taxonomist to see the results and make further refinements if required.

You can read more about the experience of a Taxonomist managing this iterative process in a recent blog post by Sarah Downs, our Director of Client Solutions: Graphite Knowledge Studio: Putting Taxonomists in the Driver’s Seat

The challenge of trust

Our last challenge is the challenge of trust, which has come to the forefront as people adopt Large Language Models (LLMs). Generative AI in general, and LLMs in particular, pose challenges around data privacy, the opaqueness of data sources, and the veracity of results.

Opaqueness of data sources

Synaptica’s technology partner Squirro tackled opaqueness of sources head on when it introduced its LLM implementation, SquirroGPT. Synaptica has embraced this technology and is currently developing a chat-based interface to Synaptica’s user guides, tech docs, and public-facing blogs and website articles. In the example below, we asked SynapticaGPT to explain the difference between when to use SKOS and when to use OWL. The short summary it provides is pretty accurate, but what we’ve highlighted here is how the GPT cites the sources it used to generate the answer. Linking answers to sources goes a long way toward establishing trust with Generative AI (GenAI) technology.

Veracity of results

The veracity of GenAI output exposes another trust challenge to the adoption of tools like ChatGPT.

In the example below I asked the Microsoft Bing Image Creator to create images of “Dean Allemang explaining LLMs” (Dean delivered a brilliant keynote on ontologies and LLMs on day one of Taxonomy Bootcamp).

The image generator did not have an image of Dean Allemang, so it created images it thought were appropriate (i.e., it made them up). This phenomenon has been widely discussed this year. As millions of people started using ChatGPT, we quickly discovered that while its general knowledge is incredible, it also makes up answers that are not evidence-based and delivers them with convincing confidence, creating distrust as to whether any particular answer is true or made up.

One way to improve the veracity and relevance of LLMs is to combine them with taxonomies and ontologies. This was one of the key themes in Dean’s keynote (he has also published a research paper with benchmarks which is publicly available at https://arxiv.org/abs/2311.07509).

Three memorable takeaways from Dean’s Taxonomy Bootcamp keynote (my paraphrase):

  • Who understands why we need taxonomy? ChatGPT does.
  • Combining LLMs with an ontology can massively boost accuracy (37%+ in a specific test case).
  • You won’t lose your job to an AI; you’ll lose it to a person who knows how to use AI . . . a call to action for all taxonomists.

Balancing trust with the power of LLMs

Synaptica’s Graphite Knowledge Studio enables enterprises both to perform taxonomy-based categorization and to access LLMs for novel entity extraction, framing discovery questions that the LLM uses to identify entities not yet known to the taxonomy.

Enabling breakthroughs with tech stack diversity

As an important (and adorable) aside: I opened my keynote with a photo of my two beautiful dogs (Maximus and Huxley), together with their ‘Heinz 57’ genetic make-up.

I assure you this is an actual photograph. I received many questions at KMWorld asking whether it was AI-generated art.

More importantly, this image serves as a metaphor for tech stack diversity. An important aspect of Synaptica’s technology strategy is to provide a platform of diverse tooling that is flexible and open to newcomers. NLP and AI have been around for decades, but the pace of development has accelerated to warp speed. Our Graphite taxonomy management system is designed to connect to an expanding ecosystem of NLP and LLM tools such as spaCy and OpenAI’s ChatGPT.

This tech stack diversity and the flexibility it enables have been key to our ability to respond to the challenges and breakthrough moments for enterprise taxonomies:

  • The challenge of complexity
  • The challenge of scale
  • The challenge of explainability
  • The challenge of trust

Synaptica has been collaborating with businesses and organizations around the world for over twenty-five years to solve the evolving challenges of enterprise taxonomy management and semantic categorization.

It is an ongoing journey of innovation. If your organization is embarking on or managing taxonomy and categorization projects, then please reach out to the Synaptica team. We will be happy to share our experience and would like to show you the solutions we have developed.

The post Breakthrough Moments in Enterprise Taxonomy Management appeared first on Synaptica.

Insights Interview | Patrick Lambe https://synaptica.com/insights-interview-patrick-lambe/ Thu, 19 Oct 2023 11:35:29 +0000 https://www.synaptica.com/?p=7267 The post Insights Interview | Patrick Lambe appeared first on Synaptica.


We talked again with Patrick Lambe, this time focused on his new book, Principles of Knowledge Auditing: Foundations for Knowledge Management Implementation. We discussed knowledge audits, the importance of context, and organizational change. Patrick is also author of Organizing Knowledge (2007) and co-author of The Knowledge Manager’s Handbook (2019).

What have you been working on since our last interview?

PL: Recently we have been working with inter-governmental organisations on knowledge audits. Working with these projects has reminded me of the importance of organizational culture in knowledge management. Since the pandemic, the projects have been getting smaller. We aren’t seeing the big projects anymore. Organizations are seeking to adjust and refocus.

We find that with inter-governmental organizations the strategy is determined by the governments or their representatives, member-states, and stakeholders. This means strategy and objectives are highly negotiated, and the downside can be slowness to adopt change. It feels as though these organizations are structured to minimise the effects of change. This is an interesting challenge in knowledge management, because the objective of most knowledge management activity is change.

Individuals respond to change if they can see the purpose and are supported throughout the process. Too often we design change and don’t understand the context or communicate “the change in context” well. People who are expected to change their behaviours are often uninvolved in the design of the change. They have competing demands coming at them from different directions.

Tell us about how this new book on Knowledge Auditing evolved?

PL: After my first book, Organizing Knowledge, I thought it would be a good idea to write a similar book on practical methods for knowledge auditing. I thought I had a fairly robust methodology, derived from the classic information auditing process, but extended to knowledge management.

The more I researched for this new book the more I realized that the information audit tradition was just one practice of auditing relating to knowledge use in organisations.

There are several traditions which date back several decades. Communication audits started in the 1950s. Communication at that time was about the flow of information to decision makers. Then information audits emerged in the 1970s and 1980s. Then knowledge audits in the 1990s and 2000s. Each of these had developed more or less independently of each other. They had their own traditions, their own principles, frameworks and methodologies. What I imagined as a simple practice was in fact a whole landscape of competing definitions and different assumptions about what was meant. I realised it was just not possible to write that book without first making sense of that landscape. Which is why this latest book is about Principles of Knowledge Auditing. The original book idea is still on my agenda but I’m about 80% done on another book focused on knowledge mapping before I can get to the original book idea. Knowledge mapping is a whole other practice area within knowledge auditing.


You reference several types of audits in the book. Can you describe what a knowledge audit is?

PL: A knowledge audit can be many things. In the book I describe several different forms. These are guided by purpose, and they help you take stock of your environment, whether that means knowledge, resources, or activities.

You can approach this exercise in a number of different ways. The simplest, or foundational way is to do an inventory audit of your content. There are limitations with this approach – it will only cover what’s visible. You don’t get a salient sense of the most current content, or the non-tangible knowledge that people use to perform their work. When we’re doing inventory audits, we do knowledge mapping, which covers both explicit and tacit knowledge, and mapping starts with establishing the context of knowledge use.

What’s the activity you’re performing and what are the knowledge resources that you use to do that?

What is being performed, who is performing it, what knowledge are they relying on to perform those tasks?

If you want to understand how well you are doing in terms of your current knowledge and information flows, you might want to do a diagnostic around the pain points that people uncover including:

  • cultural behaviours
  • knowledge management processes
  • capabilities in terms of use of technology
  • methods for encouraging sharing

These are evaluative types of audits: discovery review audits that capture what you are doing, how you are doing it, and, based on your needs and goals, what you should do next.

Then there are more formal audits where you are auditing to a standard or benchmarking against a set of external practices. Here you may be looking at how you extract value from your knowledge resources. You can use these distinct audit types individually or in combination. It all depends on the purpose and goal for why you are doing the audit in the first place.

For example, a large organization looking for recognition for their knowledge management program may opt for a standards-based audit like ISO30401 management systems audit. On the other end of the spectrum, you may be new to KM and you’re not sure where to start. In this case you might combine an inventory audit and a discovery review audit, just to take stock of what you have and where your opportunities for improvement are.

It’s a good idea to use a combination of audits when you want to bring about change. The most interesting elements of knowledge work in organizations are often not easily observed. You need different methodologies to get different perspectives and look for the common patterns. It’s a kind of triangulation technique. A single audit instrument will not tell you everything that you need to know if your objective is to bring about real and useful change.


How can you use the knowledge audit tools and methodologies you described to improve their taxonomies and ontologies?

PL: Taxonomy is still perceived by some as a technical discipline about defining terms and relationships. Understanding the context in which the taxonomy will be used has received relatively little attention in the past.

Testing mechanisms, use cases, well developed scenarios representing real people undertaking real work are not always in place. They are not driving the design or validation of the taxonomy.

Building a rich sense of context is exactly what a knowledge audit is intended to achieve. Knowledge mapping as part of an inventory audit also builds out contemporaneous descriptions of the key knowledge resources used to perform key activities. These form a rich evidence base for taxonomy design, which is why the majority of our taxonomy work follows on from a knowledge audit. Our clients benefit from three outputs: a knowledge management strategy to direct the purpose of the taxonomy, an evidence base for the taxonomy design, and context descriptions to build use-case scenarios for testing.

What did you learn from your research that holds relevance for the practice of knowledge management today?

PL: I learned a lot from the practice of communication audits in the 1950s, 60s and 70s. There was some really fascinating work on methodologies for understanding communication and knowledge flows in organisations that has since been lost. Currently these methods are not widely available or discussed in the knowledge management space.

In knowledge audits we rely too much on surveys and interviews. Surveys are not great methods for understanding the particularities of an organization’s working context. A survey can only ask questions that you already predict are going to be relevant. You’re not going to discover anything surprising or new.

Interviews are typically with senior managers who have their own, not necessarily well-informed opinions on what should happen. These opinions may compete with the opinions of other stakeholders. You have no basis for resolving those into a common picture. These methods provide the backbone for a lot of knowledge audit practice, but they are not systematic or evidence based.

What I learned most from studying earlier forms of communication and information audits was the range of available group-based methodologies for developing a well-founded understanding not just of the organization’s contexts but also of its needs and opportunities.

These are participative methods which means that they involve the people who do the actual work in representing their work and their needs, and then collaborate with us in designing the change. They can tell us about the pain points, what’s working and what’s not working. This is a much stronger basis for identifying commonalities and recommendations that might work. Then the individuals who helped you build that picture are going to recognise the expected change when it comes along. It will make sense to them.

In the book you describe methods and case studies using questions. What makes a good investigative question?

PL: Technically, a good investigative question is one that tells you something useful that you didn’t already know or couldn’t necessarily predict. ‘Why?’ is a very good question in the right context. You’re going to be asking questions like:

How do you do this?

Why do you do it?

What follows from that?

Who/what else depends on this process?

You need to ask broad questions about the activity that you’re meant to be supporting in order to understand it. Use the types of questions that help build that out: What? When? Where? Why? How? You can then use fact-based survey questions to validate and understand this in depth, but the real goal is to understand the context of work in new ways.

What challenges in KM are shared by Taxonomists?

PL: Like knowledge managers, taxonomists must figure out how to use technology to help individuals and organizations gain access to knowledge and information. A major issue, and an interesting area in knowledge management, is the work around intangible knowledge. This is often difficult to represent through taxonomies: representing and making accessible the knowledge that people actually use, those informal and undocumented interactions.

Another area shared between taxonomy and knowledge management is the context sensitivity problem: especially when the rules of the game change. This can be the operating context, the people themselves, their roles, or the activity that they’re deployed in. Both disciplines can do a lot of running around without making progress because you haven’t fully understood the context. There might be critical elements that you haven’t seen or taken into account.


How can ontologists use these techniques in developing, improving and understanding their work?

PL: Start with the knowledge mapping process. When you start with an activity, you ask: What knowledge do you need? What resources are needed to perform this activity? Who else do you interact with to perform it? This approach gives you:

  1. A clear, agreed context from two or three people who perform the same activity, providing a shared, well-founded understanding of the activity.
  2. How that activity relates to other knowledge resources.
  3. The language that is used to describe those resources.

Together, these provide the raw material for the taxonomy or ontology.

How can technology enable the work of knowledge managers? 

PL: We need to recognise that technology both enables and disables. There is a flawed habit of investing in technology, throwing it at the organization and expecting everything and everyone to adapt itself around this technology. Sometimes it just doesn’t take, sometimes it’s useful, and sometimes it disrupts.

Knowledge management practice tends to be led by the technology. We know that it’s hard to develop a good taxonomy for an organization using SharePoint. SharePoint does not accommodate all the useful features of a standards-based taxonomy: it doesn’t handle synonyms well, allow you to map related-term relationships, or represent polyhierarchy. Yet there it is; it’s a fact of life we have to deal with.

In principle, taxonomy technologies can enable various aspects of the work of knowledge managers. Taxonomy is particularly useful at managing information. It’s good at helping people communicate and collaborate with each other. It can be useful in identifying pockets of expertise in the organization. However, this is limited because the technology relies heavily on explicit knowledge activity. It doesn’t capture implicit, non-visible information. We think we can see everything through the system, but the reality is we can’t.

Technology can be useful in the learning and improving strand of knowledge management. There are text analytics tools available now capable of crawling through project reports, evaluation reports, and identifying common lessons and patterns. These enable analysts to do a preliminary gathering of content and figure out common patterns that the organisation needs to learn from. Technology can be useful in offering different ideas and options, e.g., in looking at your knowledge risks, maintaining critical knowledge, learning and improving, supporting innovation and change.

When you select a technology tool, have clear use cases and assess the technology on those examples. Will it perform to the organization’s requirements? Is it actually doing what you want it to do and what you’re expecting? We use Synaptica software exclusively when working on taxonomies in the context of knowledge management.


Principles of Knowledge Auditing: Foundations for Knowledge Management Implementation is available from Knowledge Auditing. You can read the previous Insights Interview with Patrick Lambe here.

Synaptica Insights is our popular series of use cases sharing stories, news, and learning from our customers, partners, influencers, and colleagues. You can review the full list of Insight interviews online including our recent interview with Helen Lippell.

The post Insights Interview | Patrick Lambe appeared first on Synaptica.

Insights Interview | Bonnie Bowes https://synaptica.com/insights-interview-bonnie-bowes/ Tue, 26 Sep 2023 13:25:32 +0000 https://www.synaptica.com/?p=7226 The post Insights Interview | Bonnie Bowes appeared first on Synaptica.


We interviewed Bonnie Bowes, a Taxonomist and Information Architect Specialist with over ten years of experience developing and managing enterprise taxonomies. Bonnie is Product Taxonomist at Checkatrade. Prior to Checkatrade she has worked at Meta (Facebook), Sony PlayStation, Apple, and Chevron. Bonnie holds a MA in Library and Information Science from San Jose State University and a BA in History from George Mason University.

Tell us about you and your experiences?
BB: I’m an American living in London. I moved from California to the U.K. with my cat Juno in August 2022. I grew up in Kansas, in the middle of the US, and moved to Washington, D.C., to attend college, and then to California for graduate school. After graduate school I continued to live and work in the San Francisco Bay Area for many years before relocating to London.

I’m interested in linguistics and etymology, learning about the root meaning of words and how they generate mental images that translate into things and concepts. As a kid I spent a lot of time at the public library. I figured out how to find books on my own. From an early age I was drawn to libraries, collections of things, classification systems. At university I studied Latin and found the history of words and how they are related fascinating. I wanted a career that reflected my interests, and I wanted to work with either books or art. Library and Information Science was a natural fit for me.

Portrait photo of Bonnie Bowes

As an undergraduate I worked at the National Gallery of Art Library in Washington DC. It was a life-changing experience. Working in a prestigious library offered a peek into a certain type of career I hadn’t known existed or was otherwise out of reach. To me the art librarians were so glamorous and sophisticated. One of my tasks was opening packages in the mail room and placing them on the right shelf by auction house. Let me tell you, it was an incredible mailroom job. I had access to rare books and artifacts. I learned to assist the cataloguers and how to archive exhibitions. I wanted to work at the National Gallery of Art Library forever. But I couldn’t because I was on a student work program. My experience at the National Gallery of Art Library influenced my decision to change from Law to Library and Information Science. I thought the world could use another librarian more than it could use another lawyer.

How did your career as a Taxonomist develop?
BB: Many taxonomists find their calling by accident. For me, becoming a professional taxonomist was intentional. I had specific career goals in mind and devised a plan. After college I moved from Washington DC to California. The 2008 American recession was in full swing. I was stuck at a low paid job with no prospects. Then Google started hiring librarians.

Google was in a partnership with Harvard University at the time, working on this great digitization project. They were digitizing all of the works of literature available in the public domain, and it was huge. It was the first time I saw librarians in the news paired with tech. What was interesting to me was Google didn’t refer to them as librarians, although they were. They were calling the librarians “taxonomists.” I thought to myself that’s what I want to do! I want to be a librarian in tech. In graduate school I focused on the more technical courses such as database systems, digitization, information theory, programming, metadata management.

My first professional taxonomist role was as a digital archivist at Chevron, where I learned cataloging, controlled vocabulary, and taxonomy design under the mentorship of the head digital librarian, Ron. He ran a tight ship and adhered to strict best practices, making it an excellent early professional experience. This strong foundational learning shaped my practices for my future career.

From Chevron I went on to work in digital asset manager and taxonomist roles at other technology companies. I was recruited for and took on more senior roles. I worked on several types of taxonomies for very different products. I was the first taxonomist hired at two of the companies. I pioneered taxonomy teams and advocated for a taxonomy community of practice. Being in the right place at the right time helped, meaning Silicon Valley during a time when companies started to really value the information science skillset. Corporate investment in big data, digitization, asset management, E-commerce, and machine learning led to new jobs in the field, including taxonomist roles.

What do you look for as a customer of Taxonomy software? What do you expect from a tool?
BB: At my previous company I made a successful business case for licensing specialized taxonomy software – the first in the company. The project was to design a taxonomy out of 300k random terms, a crazy flat list, so professional software was obviously needed. I made a requirements list and ranked them into "must have," "nice to have," and "could have" buckets. I used the Synaptica Taxonomy Top 100 Checklist, which is a fantastic starting point for anyone evaluating a new taxonomy tool. It was a key reference resource for moving away from spreadsheets to a professional taxonomy software tool. Assuming you already have basic functionality, my taxonomy tool requirements shortlist includes:

  • Batch editing. Specifically the ability to integrate, merge, deprecate and restore terms in large batches with minimal clicks. The ability to batch edit is critical, because taxonomies have reached such massive sizes that any management tool needs to be able to edit at scale. Ideally, you can do all of your editing in-tool and not have to download the taxonomy into other software like Excel, make changes, and then re-upload into the tool.
  • The ability to tag and also filter on tags is essential for being able to isolate specific areas of the taxonomy.
  • The tool must be collaborative to support cross-functional team adoption. If an editor cannot leave comments to another editor, it won’t be viewed as a collaborative environment.
  • The tool should offer customizable output file format options that meet the end users’ requirements for transfer and upload. Unless you have a robust API from the get-go, you’re going to have to use the input / output file protocol to deliver the taxonomy to your engineering team, and they will have specific ingest format requirements.
  • The ability to link, map, and integrate multiple taxonomies is important for the type of work I usually do.
  • Robust reporting and analytical features are important. Can the tool generate its own reports, or will you have to export it into another tool to run reporting?
  • Does the tool support an API? You’ll need an API so internal servers can make real-time changes and avoid import/export delivery.
  • Finally, is there a customer success package or support that’s included in the contract? Unless your internal engineering team is willing to provide tech support every time there is an issue, you’ll need a reliable customer service package you can turn to when things break, or questions arise.

Why did you choose Synaptica?
BB: I knew from experience that having the right functionality for the company’s unique business needs, extreme reliability, and an extensive customer support package are important. Software is never what you need it to be right out of the box. There has to be a continual relationship between business owner and vendor to ensure that the tool develops, that it continues to do what we need it to do, as project needs change. I wanted a tool with a vendor that was willing to enter a partnership with the taxonomy team for the continued development and adoption of the tool. Synaptica were willing to do that. Put all this together and that’s been our leverage, our great success.


You collaborated with Synaptica to develop custom functionality and an auto mapping tool. Can you tell us about that experience?
BB: The team needed the ability to quickly map and integrate large taxonomies to meet our goals. On the scale we were working with, it wasn’t possible to do manually. We needed an automated mapping tool that could understand the contextual meaning of the terms, not string matching. Take a term like “pluto” which has four meanings. “Pluto” can refer to a Roman god, a short name for plutonium, a Disney dog character, or a planet. Mapping terms by string matching “pluto” doesn’t work.

We developed a method where the taxonomy tool looks at other characteristics of the term and the node path: the broader terms, the narrower terms, the source URL, the category. It looked at patterns and similarities in the content. Based on the contextual findings, the tool assigned a mapping confidence score. Let’s say a confidence score of 90% or higher, based on similar broader and narrower terms and categories, suggests an accurate match. This way we could automatically map tens of thousands of nodes in a batch. And rather than spending weeks or months reviewing the mapped nodes manually, we could approve all mappings with a confidence score of 90% or above. That way, the team only had to spend time manually reviewing the lowest-confidence mappings.
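The score-then-triage workflow described above can be sketched roughly in Python. Everything here is illustrative: the function names, the feature weights, and the set-overlap (Jaccard) scoring are assumptions for the sketch, not details of the actual tool.

```python
# Hypothetical sketch of context-based mapping with a confidence threshold.
# A candidate pair is scored on shared context (broader/narrower terms,
# category), not on string-matching the labels themselves.

def jaccard(a, b):
    """Overlap between two sets of context terms, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def confidence(src, tgt):
    """Score a candidate mapping from the nodes' context (weights are assumed)."""
    return (0.4 * jaccard(src["broader"], tgt["broader"])
            + 0.4 * jaccard(src["narrower"], tgt["narrower"])
            + 0.2 * (src["category"] == tgt["category"]))

def triage(pairs, threshold=0.90):
    """Auto-approve high-confidence mappings; queue the rest for human review."""
    approved, review = [], []
    for src, tgt in pairs:
        (approved if confidence(src, tgt) >= threshold else review).append((src, tgt))
    return approved, review
```

Under this scheme a "pluto" node whose broader terms are astronomical scores near zero against a "pluto" node filed under cartoon characters, so only the genuinely ambiguous pairs land in the manual-review queue.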

The ability to auto-match saved significant time and increased productivity. We could also determine the degree of the term associations, whether the mappings should be fuzzy, exact, or related, depending on the purpose of the project. Being able to map and integrate large taxonomies quickly and accurately made it possible for us to offer taxonomy mapping as a service to other teams.

Without the automated mapping tool, the same projects would have taken months or years to map. With the auto-match tool, they took a matter of weeks to complete accurately.

What is it like to design a taxonomy in spreadsheets versus using specialized software?
BB: For me, designing a large taxonomy using spreadsheets was a nightmare. I once revamped a taxonomy that had over ten thousand nodes in Excel. It took three months of singularly focused work and resulted in nearly fifty versions before it was ship-ready. It was one of the most painstaking, frustrating experiences of my career. You’re adding nodes, removing them, merging them – trying to keep all the node IDs intact. It’s difficult to keep the IDs associated with the right nodes while cutting and pasting in a spreadsheet. This is not what spreadsheets are meant to do.

When it was time to deliver the taxonomy, the IDs were not correct from all the cut and paste editing. This meant the engineers had to take time to figure out a solution to this problem. When you have five brilliant engineers who have to stop what they’re doing to figure out how to get nodes matched to the right node ID, that’s a huge drain on ENG resources. The project almost didn’t launch. I would never want a taxonomist to have to go through a situation where they didn’t have the right tool to perform the task. I made a vow that it wouldn’t happen again to my team, that we would build taxonomies using proper tools from then on.
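The ID-drift problem is structural: in a spreadsheet, a node’s ID and its label sit in adjacent cells that can be cut and pasted independently, while a purpose-built tool keys every node by a stable ID so that restructuring edits change relationships, never the IDs themselves. A minimal, hypothetical sketch of that idea:

```python
# Hypothetical in-memory taxonomy keyed by stable node IDs. Moving a node
# only rewrites its parent pointer; the ID (and every external reference
# to it) stays valid, which is exactly what spreadsheet cut-and-paste breaks.

taxonomy = {
    "n1": {"label": "Animals", "parent": None},
    "n2": {"label": "Dogs", "parent": "n1"},
    "n3": {"label": "Cartoon Dogs", "parent": "n1"},
}

def move(tax, node_id, new_parent_id):
    """Re-parent a node without ever touching its ID."""
    tax[node_id]["parent"] = new_parent_id

move(taxonomy, "n3", "n2")  # "Cartoon Dogs" now sits under "Dogs"
```

Because the engineers’ ingest pipeline joins on those IDs, keeping them immutable is what prevents the node/ID mismatches described above.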


How do you design for machines?
BB: You design for machines by employing rules and consistent logic the machine can understand. I begin by defining the domain the taxonomy describes and the rules for inclusion: what it is and is not. Next, I imagine I am designing for an alien whose language is logic, patterns and association. Information architecture structures are concerned with grouping things based on mental models or their relationship to other things. Taxonomies are concerned with precise classification and logical precision. This allows machine learning models to infer content understanding.

What level of granularity can machine tagging support in a taxonomy?
BB: Taxonomy can support machine tagging at any level, but inference is more accurate at the higher levels. For example, it’s easier to tag whether something is a plant or an animal than the type of species. For deeper taxonomies that require eight or more levels to describe a domain, the sweet spot is usually around level four. Granularity is a trade-off; it allows more precise description but at the cost of less accurate inference. What is confusing for a human is even more confusing for a machine. We understand well what a concept or thing is at a broad level; it becomes more complex the more granularly we deconstruct it. When thinking about granularity, keep an even level throughout the taxonomy and be intentional about how broad or granular you go. What level is necessary to describe the domain?
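One simple way to act on that sweet spot (a hypothetical illustration, not a method from the interview) is to tag content with the node path truncated to the target level, so inference works against broader, more reliably predictable categories:

```python
# Hypothetical helper: truncate a full taxonomy path to a shallower
# tagging level, trading descriptive precision for inference accuracy.

def tag_at_level(path, level=4):
    """Return the node path cut off at the chosen tagging depth."""
    return path[:level]

full_path = ["Animalia", "Chordata", "Mammalia", "Carnivora",
             "Canidae", "Canis", "Canis familiaris"]
tag_at_level(full_path)  # -> ["Animalia", "Chordata", "Mammalia", "Carnivora"]
```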

How do you use engineered solutions in partnership with human curation?
BB: Human curation is helpful to engineering solutions in situations where human nuance presents inconsistent or fuzzy logic. Curation informs ranking and relevance logic. For example, algorithms need to learn how to rank what is most important, “of all the data available, this is what we care about most”. An E-Commerce algorithm may determine “color” to be an important product attribute. But for which products? Human curation may inform the algorithm that “color” is an important attribute for curtains but not for books.

To improve search results human curation will determine the rules for how backend tables are indexed, and how keywords are weighted to retrieve and display what is most important at the top.

From the taxonomy side, human curation helps determine the right access points and level of node granularity to display in the UI. When working with clustered terms, human curation is needed to determine the granularity level of the cluster, and what to call it. In short, human curation is required whenever an understanding of human mental models is necessary. Machines cannot rely solely on logic.


Why are taxonomies important to organizations?
BB: Taxonomies are essential for any organization that relies on data and information. Data analysis is becoming the basis of the service offer and business models. Data is king. But ambiguous data is a liability, not an asset.

Taxonomies are uniquely valuable because they are the great disambiguation of the IA world. Taxonomies describe a domain. The domain can be all living things like Linnaeus’s taxonomy of life, or the products sold on an E-commerce platform.

Disambiguation is a specialized craft; it is the first step in transforming data into useful information. First, we need to know what it is and where it came from, and then we can write the rules for inclusion. That’s what taxonomies paired with robust metadata do exceptionally well. Nine million things described by seven levels: Kingdom-Phylum-Class-Order-Family-Genus-Species – boom-boom-boom.

How should an organization develop a strategy for taxonomy?
BB: Conducting a taxonomy landscape audit is a good place to start. If your organization employs taxonomy in any form, then you want to know what taxonomies are being used, where and how.

For example, while working with an E-commerce platform, we knew that one taxonomy was used for inference, another taxonomy was visible to the sellers, and possibly a third taxonomy in ads and marketing. The taxonomy landscape audit allowed us to take a holistic view of the platform, to confirm which taxonomies were doing what throughout the funnel and identify gaps and opportunities.

Once the taxonomy audit is complete, and the goals understood, you can work with a data scientist to develop opportunity sizing metrics related to the goals. This is the first phase for developing a business case for a taxonomy strategy.

How can a taxonomy expert help?
BB: Ideally an expert who designed the taxonomy will know it inside and out. They can see its shortcomings. They will be able to understand the opportunities and the limitations of the solution. Taxonomy is not a magic word; it can’t solve everything. Having an expert on hand to scope the taxonomy solution and provide realistic advice about what is and is not possible, and how to best optimize the taxonomy, is valuable for any taxonomy related project.


What advice do you give to others who are developing their taxonomy project?
BB: I always begin with a content audit before designing or building a taxonomy. It’s the second step after defining the domain. Look at the actual content, interact with it and get as many samples as you can. Terms, scheme and structure all develop from the content the taxonomy describes.

What makes a good taxonomist, what skills are needed?
BB: A successful taxonomist has solid research and strategy skills, as well as an understanding of information science theory: classification, semantics, and linguistics. They should also be comfortable working with technology. Patience and the ability to do monotonous work also help.

As a taxonomist you should be able to work well with other professions. Whoever you’re supporting – engineers, scientists, doctors, technicians – you must be able to speak their language. Whichever sector you support, learn its corporate vocabulary. Know the basics. Embed yourself in the profession your taxonomy supports.

What do you think are the biggest challenges for the taxonomy sector in the future?
BB: Remaining relevant and useful. There is a danger in becoming too comfortable with a routine way of doing things, of not stretching your skill set or embracing inevitable sector changes. That means learning hard skills like data science and analytics to become more self-reliant, embracing new technology, and adjusting to changes in the market and the world. We know some of the old ways of doing things no longer require human expertise. That doesn’t mean the writing is on the wall for taxonomy, but we do need to pay attention. Do university taxonomy courses truly reflect industry demands? Formal training for taxonomy managers would help taxonomy teams function better. I think of taxonomy as a support science. I have to keep upping my professional learning to provide solutions to the industry I’m supporting. What I was doing two, three, five or more years ago is not what will work today.

Synaptica Insights is our popular series of use cases sharing stories, news, and learning from our customers, partners, influencers, and colleagues. You can review the full catalogue of Insight interviews online.

The post Insights Interview | Bonnie Bowes appeared first on Synaptica.
