diff --git a/docs/docs/api_reference/llama_deploy/message_queues/index.md b/docs/docs/api_reference/llama_deploy/Message Queues/index.md
similarity index 100%
rename from docs/docs/api_reference/llama_deploy/message_queues/index.md
rename to docs/docs/api_reference/llama_deploy/Message Queues/index.md
diff --git a/docs/docs/api_reference/llama_deploy/message_queues/kafka.md b/docs/docs/api_reference/llama_deploy/Message Queues/kafka.md
similarity index 100%
rename from docs/docs/api_reference/llama_deploy/message_queues/kafka.md
rename to docs/docs/api_reference/llama_deploy/Message Queues/kafka.md
diff --git a/docs/docs/api_reference/llama_deploy/message_queues/rabbitmq.md b/docs/docs/api_reference/llama_deploy/Message Queues/rabbitmq.md
similarity index 100%
rename from docs/docs/api_reference/llama_deploy/message_queues/rabbitmq.md
rename to docs/docs/api_reference/llama_deploy/Message Queues/rabbitmq.md
diff --git a/docs/docs/api_reference/llama_deploy/message_queues/redis.md b/docs/docs/api_reference/llama_deploy/Message Queues/redis.md
similarity index 100%
rename from docs/docs/api_reference/llama_deploy/message_queues/redis.md
rename to docs/docs/api_reference/llama_deploy/Message Queues/redis.md
diff --git a/docs/docs/api_reference/llama_deploy/message_queues/simple.md b/docs/docs/api_reference/llama_deploy/Message Queues/simple.md
similarity index 100%
rename from docs/docs/api_reference/llama_deploy/message_queues/simple.md
rename to docs/docs/api_reference/llama_deploy/Message Queues/simple.md
diff --git a/docs/docs/index.md b/docs/docs/index.md
index cc8e96ee..e69de29b 100644
--- a/docs/docs/index.md
+++ b/docs/docs/index.md
@@ -1,361 +0,0 @@
-# 🦙 `llama_deploy` 🤖
-
-`llama_deploy` (formerly `llama-agents`) is an async-first framework for deploying, scaling, and productionizing agentic multi-service systems based on [workflows from `llama_index`](https://docs.llamaindex.ai/en/stable/understanding/workflows/). With `llama_deploy`, you can build any number of workflows in `llama_index` and then bring them into `llama_deploy` for deployment.
-
-In `llama_deploy`, each workflow is seen as a `service`, endlessly processing incoming tasks. Each workflow pulls and publishes messages to and from a `message queue`.
-
-At the top of a `llama_deploy` system is the `control plane`. The control plane handles ongoing tasks, manages state, keeps track of which services are in the network, and decides which service should handle the next step of a task using an `orchestrator`. The default `orchestrator` is purely programmatic, handling failures, retries, and state-passing.
-
-The overall system layout is pictured below.
-
-![A basic system in llama_deploy](../../system_diagram.png)
-
-## Wait, where is `llama-agents`?
-
-The introduction of [Workflows](https://docs.llamaindex.ai/en/stable/module_guides/workflow/#workflows) in `llama_index` produced the most intuitive way yet to develop agentic applications. The question then became: how can we close the gap between developing an agentic application as a workflow and deploying it?
-
-With `llama_deploy`, the goal is to make the experience as close to 1:1 as possible between something you built in a notebook and something running in a cluster in the cloud. `llama_deploy` enables this by letting you pass in and deploy any workflow as-is.
-
-## Installation
-
-`llama_deploy` can be installed with pip, and relies mainly on `llama_index_core`:
-
-```bash
-pip install llama_deploy
-```
-
-## Getting Started
-
-### High-Level Deployment
-
-`llama_deploy` provides a simple way to deploy your workflows using configuration objects and helper functions.
-
-When deploying, you'll generally want to deploy the core services and each workflow from their own Python scripts (or Docker images, etc.).
-
-Here's how you can deploy a core system and a workflow:
-
-### Deploying the Core System
-
-To deploy the core system (message queue, control plane, and orchestrator), you can use the `deploy_core` function:
-
-```python
-from llama_deploy import (
-    deploy_core,
-    ControlPlaneConfig,
-    SimpleMessageQueueConfig,
-)
-
-
-async def main():
-    await deploy_core(
-        control_plane_config=ControlPlaneConfig(),
-        message_queue_config=SimpleMessageQueueConfig(),
-    )
-
-
-if __name__ == "__main__":
-    import asyncio
-
-    asyncio.run(main())
-```
-
-This will set up the basic infrastructure for your `llama_deploy` system. You can customize the configs to adjust ports and basic settings, as well as swap in different message queue configs (Redis, Kafka, RabbitMQ, etc.).
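-
-For example, here is a minimal sketch that swaps in Redis instead of the simple in-process queue. It assumes the `RedisMessageQueueConfig` import path and its `url` field shown below; check the Message Queues API reference for the exact names in your version:
-
-```python
-from llama_deploy import deploy_core, ControlPlaneConfig
-
-# assumed import path; see the Message Queues API reference
-from llama_deploy.message_queues.redis import RedisMessageQueueConfig
-
-
-async def main():
-    # everything else stays the same; only the queue config changes
-    await deploy_core(
-        control_plane_config=ControlPlaneConfig(),
-        message_queue_config=RedisMessageQueueConfig(
-            url="redis://localhost:6379"
-        ),
-    )
-
-
-if __name__ == "__main__":
-    import asyncio
-
-    asyncio.run(main())
-```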
-
-### Deploying a Workflow
-
-To deploy a workflow as a service, you can use the `deploy_workflow` function:
-
-```python
-from llama_deploy import (
-    deploy_workflow,
-    WorkflowServiceConfig,
-    ControlPlaneConfig,
-)
-from llama_index.core.workflow import Workflow, StartEvent, StopEvent, step
-
-
-# create a dummy workflow
-class MyWorkflow(Workflow):
-    @step()
-    async def run_step(self, ev: StartEvent) -> StopEvent:
-        # Your workflow logic here
-        arg1 = str(ev.get("arg1", ""))
-        result = arg1 + "_result"
-        return StopEvent(result=result)
-
-
-async def main():
-    await deploy_workflow(
-        workflow=MyWorkflow(),
-        workflow_config=WorkflowServiceConfig(
-            host="127.0.0.1", port=8002, service_name="my_workflow"
-        ),
-        control_plane_config=ControlPlaneConfig(),
-    )
-
-
-if __name__ == "__main__":
-    import asyncio
-
-    asyncio.run(main())
-```
-
-This will deploy your workflow as a service within the `llama_deploy` system, and register the service with the existing control plane and message queue.
-
-### Interacting with your Deployment
-
-Once deployed, you can interact with your deployment using a client.
-
-```python
-from llama_deploy import LlamaDeployClient, ControlPlaneConfig
-
-# points to deployed control plane
-client = LlamaDeployClient(ControlPlaneConfig())
-
-session = client.create_session()
-result = session.run("my_workflow", arg1="hello_world")
-print(result)
-# prints 'hello_world_result'
-```
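-
-The client above blocks until the workflow run completes. If you are already inside an event loop, a minimal sketch with the async client is below; it assumes `AsyncLlamaDeployClient` mirrors the sync client's API (`create_session` and `run` as awaitables), so double-check the method names against the Client API reference for your version:
-
-```python
-import asyncio
-
-from llama_deploy import AsyncLlamaDeployClient, ControlPlaneConfig
-
-
-async def main():
-    # assumes the async client mirrors the sync client's API
-    client = AsyncLlamaDeployClient(ControlPlaneConfig())
-
-    session = await client.create_session()
-    result = await session.run("my_workflow", arg1="hello_world")
-    print(result)
-
-
-asyncio.run(main())
-```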
-
-### Deploying Nested Workflows
-
-Every `Workflow` is capable of injecting and running nested workflows. For example:
-
-```python
-from llama_index.core.workflow import Workflow, StartEvent, StopEvent, step
-
-
-class InnerWorkflow(Workflow):
-    @step()
-    async def run_step(self, ev: StartEvent) -> StopEvent:
-        arg1 = ev.get("arg1")
-        if not arg1:
-            raise ValueError("arg1 is required.")
-
-        return StopEvent(result=str(arg1) + "_result")
-
-
-class OuterWorkflow(Workflow):
-    @step()
-    async def run_step(
-        self, ev: StartEvent, inner: InnerWorkflow
-    ) -> StopEvent:
-        arg1 = ev.get("arg1")
-        if not arg1:
-            raise ValueError("arg1 is required.")
-
-        arg1 = await inner.run(arg1=arg1)
-
-        return StopEvent(result=str(arg1) + "_result")
-
-
-inner = InnerWorkflow()
-outer = OuterWorkflow()
-outer.add_workflows(inner=inner)
-```
-
-`llama_deploy` makes it dead simple to spin up each workflow above as a service, and run everything without any changes to your code!
-
-Just deploy each workflow:
-
-> [!NOTE]
-> This code is launching both workflows from the same script, but these could easily be separate scripts, machines, or Docker containers!
-
-```python
-import asyncio
-
-from llama_deploy import (
-    WorkflowServiceConfig,
-    ControlPlaneConfig,
-    deploy_workflow,
-)
-
-
-async def main():
-    inner_task = asyncio.create_task(
-        deploy_workflow(
-            inner,
-            WorkflowServiceConfig(
-                host="127.0.0.1", port=8003, service_name="inner"
-            ),
-            ControlPlaneConfig(),
-        )
-    )
-
-    outer_task = asyncio.create_task(
-        deploy_workflow(
-            outer,
-            WorkflowServiceConfig(
-                host="127.0.0.1", port=8002, service_name="outer"
-            ),
-            ControlPlaneConfig(),
-        )
-    )
-
-    await asyncio.gather(inner_task, outer_task)
-
-
-if __name__ == "__main__":
-    asyncio.run(main())
-```
-
-And then use it as before:
-
-```python
-from llama_deploy import LlamaDeployClient, ControlPlaneConfig
-
-# points to deployed control plane
-client = LlamaDeployClient(ControlPlaneConfig())
-
-session = client.create_session()
-result = session.run("outer", arg1="hello_world")
-print(result)
-# prints 'hello_world_result_result'
-```
-
-## Components of a `llama_deploy` System
-
-In `llama_deploy`, there are several key components that make up the overall system:
-
-- `message queue` -- the message queue acts as a queue for all services and the `control plane`. It has methods for publishing messages to named queues, and delegates messages to consumers.
-- `control plane` -- the control plane is the central gateway to the `llama_deploy` system. It keeps track of current tasks and the services that are registered to the system. The `control plane` also performs state and session management and utilizes the `orchestrator`.
-- `orchestrator` -- this module handles incoming tasks and decides which service to send them to, as well as how to handle results from services. By default, the `orchestrator` is very simple, and assumes incoming tasks have a destination already specified. Beyond that, the default `orchestrator` handles retries, failures, and other nice-to-haves.
-- `services` -- services are where the actual work happens. A service accepts an incoming task and context, processes it, and publishes a result. When you deploy a workflow, it becomes a service.
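-
-All of these components find each other purely through the config objects shown above. As a minimal illustrative sketch (using only the `host`/`port` fields that already appear in the earlier examples), running the control plane on a non-default port just means passing the same config object everywhere:
-
-```python
-from llama_deploy import (
-    deploy_core,
-    ControlPlaneConfig,
-    SimpleMessageQueueConfig,
-)
-
-# define once, reuse everywhere: the same host/port must be passed to
-# deploy_core, to every deploy_workflow call, and to any client, so all
-# parts of the system agree on where the control plane lives
-control_plane_config = ControlPlaneConfig(host="127.0.0.1", port=8080)
-
-
-async def main():
-    await deploy_core(
-        control_plane_config=control_plane_config,
-        message_queue_config=SimpleMessageQueueConfig(),
-    )
-
-
-if __name__ == "__main__":
-    import asyncio
-
-    asyncio.run(main())
-```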
-
-## Low-Level Deployment
-
-For more control over the deployment process, you can use the lower-level API. Here's what's happening under the hood when you use `deploy_core` and `deploy_workflow`:
-
-### deploy_core
-
-The `deploy_core` function sets up the message queue, control plane, and orchestrator. Here's what it does:
-
-```python
-async def deploy_core(
-    control_plane_config: ControlPlaneConfig,
-    message_queue_config: BaseSettings,
-    orchestrator_config: Optional[SimpleOrchestratorConfig] = None,
-) -> None:
-    orchestrator_config = orchestrator_config or SimpleOrchestratorConfig()
-
-    message_queue_client = _get_message_queue_client(message_queue_config)
-
-    control_plane = ControlPlaneServer(
-        message_queue_client,
-        SimpleOrchestrator(**orchestrator_config.model_dump()),
-        **control_plane_config.model_dump(),
-    )
-
-    message_queue_task = None
-    if isinstance(message_queue_config, SimpleMessageQueueConfig):
-        message_queue_task = _deploy_local_message_queue(message_queue_config)
-
-    control_plane_task = asyncio.create_task(control_plane.launch_server())
-
-    # let services spin up
-    await asyncio.sleep(1)
-
-    # register the control plane as a consumer
-    control_plane_consumer_fn = await control_plane.register_to_message_queue()
-
-    consumer_task = asyncio.create_task(control_plane_consumer_fn())
-
-    # let things sync up
-    await asyncio.sleep(1)
-
-    # let things run
-    if message_queue_task:
-        all_tasks = [control_plane_task, consumer_task, message_queue_task]
-    else:
-        all_tasks = [control_plane_task, consumer_task]
-
-    shutdown_handler = _get_shutdown_handler(all_tasks)
-    loop = asyncio.get_event_loop()
-    while loop.is_running():
-        await asyncio.sleep(0.1)
-        signal.signal(signal.SIGINT, shutdown_handler)
-
-        for task in all_tasks:
-            if task.done() and task.exception():  # type: ignore
-                raise task.exception()  # type: ignore
-```
-
-This function:
-
-1. Sets up the message queue client
-2. Creates the control plane server
-3. Launches the message queue (if using SimpleMessageQueue)
-4. Launches the control plane server
-5. Registers the control plane as a consumer
-6. Sets up a shutdown handler and keeps the event loop running
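-
-The private helpers (`_get_message_queue_client`, `_deploy_local_message_queue`, `_get_shutdown_handler`) are internal and not documented here. As a rough, hypothetical sketch of what a handler like `_get_shutdown_handler` might do (cancel the outstanding tasks so the loop above can unwind):
-
-```python
-import asyncio
-import signal
-from typing import List
-
-
-def _get_shutdown_handler(tasks: List[asyncio.Task]):
-    # hypothetical reconstruction, not the actual implementation:
-    # returns a signal handler that cancels any still-running tasks
-    def shutdown_handler(sig, frame):
-        for task in tasks:
-            if not task.done():
-                task.cancel()
-
-    return shutdown_handler
-```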
-
-### deploy_workflow
-
-The `deploy_workflow` function deploys a workflow as a service. Here's what it does:
-
-```python
-async def deploy_workflow(
-    workflow: Workflow,
-    workflow_config: WorkflowServiceConfig,
-    control_plane_config: ControlPlaneConfig,
-) -> None:
-    control_plane_url = control_plane_config.url
-
-    async with httpx.AsyncClient() as client:
-        response = await client.get(f"{control_plane_url}/queue_config")
-        queue_config_dict = response.json()
-
-    message_queue_config = _get_message_queue_config(queue_config_dict)
-    message_queue_client = _get_message_queue_client(message_queue_config)
-
-    service = WorkflowService(
-        workflow=workflow,
-        message_queue=message_queue_client,
-        **workflow_config.model_dump(),
-    )
-
-    service_task = asyncio.create_task(service.launch_server())
-
-    # let service spin up
-    await asyncio.sleep(1)
-
-    # register to message queue
-    consumer_fn = await service.register_to_message_queue()
-
-    # register to control plane
-    control_plane_url = (
-        f"http://{control_plane_config.host}:{control_plane_config.port}"
-    )
-    await service.register_to_control_plane(control_plane_url)
-
-    # create consumer task
-    consumer_task = asyncio.create_task(consumer_fn())
-
-    # let things sync up
-    await asyncio.sleep(1)
-
-    all_tasks = [consumer_task, service_task]
-
-    shutdown_handler = _get_shutdown_handler(all_tasks)
-    loop = asyncio.get_event_loop()
-    while loop.is_running():
-        await asyncio.sleep(0.1)
-        signal.signal(signal.SIGINT, shutdown_handler)
-
-        for task in all_tasks:
-            if task.done() and task.exception():  # type: ignore
-                raise task.exception()  # type: ignore
-```
-
-This function:
-
-1. Sets up the message queue client
-2. Creates a WorkflowService with the provided workflow
-3. Launches the service server
-4. Registers the service to the message queue
-5. Registers the service to the control plane
-6. Sets up a consumer task for the service
-7. Sets up a shutdown handler and keeps the event loop running
diff --git a/docs/docs/module_guides/workflow/deployment.md b/docs/docs/module_guides/llama_deploy/deployment.md
similarity index 100%
rename from docs/docs/module_guides/workflow/deployment.md
rename to docs/docs/module_guides/llama_deploy/deployment.md
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 5bc9fe28..e27d4c33 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -19,21 +19,11 @@ markdown_extensions:
 nav:
   - Home: index.md
   - API Reference:
-      - Client: api_reference/llama_deploy/client.md
-      - Control Plane: api_reference/llama_deploy/control_plane.md
-      - Deployment: api_reference/llama_deploy/deployment.md
-      - Message Consumers: api_reference/llama_deploy/message_consumers.md
-      - Message Publishers: api_reference/llama_deploy/message_publishers.md
-      - Message Queues:
-          - api_reference/llama_deploy/message_queues/index.md
-          - api_reference/llama_deploy/message_queues/kafka.md
-          - api_reference/llama_deploy/message_queues/rabbitmq.md
-          - api_reference/llama_deploy/message_queues/redis.md
-          - api_reference/llama_deploy/message_queues/simple.md
-      - Messages: api_reference/llama_deploy/messages.md
-      - Orchestrators: api_reference/llama_deploy/orchestrators.md
-      - Services: api_reference/llama_deploy/services.md
-      - Types: api_reference/llama_deploy/types.md
+      - Llama Deploy:
+          - api_reference/llama_deploy
+  - Component Guides:
+      - Llama Deploy:
+          - module_guides/llama_deploy
 plugins:
   - search
   - include_dir_to_nav